PREFACE

The PDP-15/9 statistics package (STATPAC) is a FORTRAN-coded program used to perform 
statistical analysis on user-supplied data. STATPAC runs under control of the PDP-15 Advanced 
and Background/Foreground Monitor Systems and the PDP-9 Keyboard Monitor System, and 
requires some form of auxiliary bulk storage, such as DECtape or disk. This guide is intended 
to set forth operating procedures for the user, and does not contain detailed descriptions of 
the internal operations of the package. The Guide is organized as follows: 

Chapter 1 Introduction to STATPAC 

Chapter 2 Module Operating Procedures 

Chapter 3 Implementing and Augmenting STATPAC 

Chapter 4 Sample Operation 

Chapter 1 provides general descriptions of each of the modules in the package. Chapter 2 details 
the command dialogue and possible error messages. Chapter 3 contains information related to 
building an executable file and augmenting basic systems either through addition of user software 
modules or through expanding the hardware configuration. Chapter 4 contains sample dialogue 
and output for all STATPAC modules. The Appendix contains detailed algorithms for computations 
performed within the package which will be of interest to the more demanding reader. 
Finally, a bibliography of statistical texts and applicable manuals is included for convenient 
reference. 

No attempt is made within the Guide to educate the novice statistician. It is assumed that the 
user has a good background in statistics and can use the package as a tool to achieve the desired 
results. 



CHAPTER 1 
INTRODUCTION 

STATPAC is a FORTRAN-coded program used to perform statistical analysis on user-supplied data. The package is 
designed to run under control of PDP-15/9 monitor systems in a hardware configuration that includes 8K of core 
memory, a console Teletype, a high-speed paper tape reader and punch, and two bulk storage units. Due to the limitations of 
8K core memory, the package is divided into logical modules, each of which consists of one or more core loads (i.e., 
chains or overlays). The modules (Figure 1-1) reside on a bulk storage device (logical -4) and include CONTROL, INPUT, 
SMMRY, STPRG, and MLTRG. Basic operation of the package requires that the user supply data to the INPUT module 
which prepares standardized binary data files. The user then can, depending upon his next task, select any one of the 
modules for operation. Briefly, the SMMRY (Summary) module provides the user with a set of descriptive statistics 
based upon his input files. The descriptive statistics include mean, variance, standard deviation, standard error of the 
mean, skewness, kurtosis, maximum, minimum, range, and a correlation matrix. The other two modules (STPRG and 
MLTRG) can be selected to perform stepwise linear regression and multiple linear regression, respectively. 

The following paragraphs provide a general description of each of the modules. User dialogues are presented in Chapter 2 
and detailed algorithms for the internal computations are given in Appendix A. 

1.1 CONTROL MODULE

The CONTROL module acts as an executive routine, performing miscellaneous control functions while providing a means 
for communications between modules. Initially, the CONTROL module is loaded into core (see Chapter 3). Once 
loaded, it types the message 

*PROG 

The user must respond by typing one of the following names: 

INPUT 

SMMRY 

STPRG 

MLTRG 

EXIT 

By responding with EXIT, the user terminates all processing by STATPAC and control is returned to the monitor. 
Responding with one of the module names causes the corresponding module (or a portion of it) to be loaded from the 
STATPAC tape (logical -4), overlaying the CONTROL module. Control is transferred to the module that has been loaded, 
and it requests and obtains the remaining control parameters required to perform an analysis by conducting a dialogue 
with the user (see Chapter 2). When the selected module has completed its task, it requests that the user supply the 
name of the next module to be loaded. If the user requests the module that is already in core, the module again requests 
the required control parameters. This continues until the user requests a different module, at which time the CONTROL 
module is loaded into core and, in turn, loads the selected module. 

Figure 1-1 STATPAC Logic Modules, Flow Diagram 

1.2 INPUT MODULE 

The INPUT module performs two basic tasks: 

a. Conversion of user-supplied BCD data to binary. 

b. Preparation and storage of the standardized binary data files on a file structured bulk storage device. 

The user's input data consists of observations, with each observation consisting of a number of variables. For example, 
each person living in a town could be considered an observation consisting of the variables age, weight, height, etc. One 
can think of a data file, then, as a rectangular array or matrix of the form: 

                   Variable (1)    Variable (2)    ...    Variable (L)

observation 1      $X_{1,1}$       $X_{1,2}$       ...    $X_{1,L}$

observation 2      $X_{2,1}$       $X_{2,2}$       ...    $X_{2,L}$

    ...

observation N      $X_{N,1}$       $X_{N,2}$       ...    $X_{N,L}$

Note: In $X_{i,j}$, i denotes the i-th observation and j denotes the j-th variable. 

This data file consists of N observations, with each observation consisting of L variables. One can think of the 
observations as rows and the variables as columns. In using the statistics program, the user will frequently be asked: "What are 
the variables?", to which he must respond by enumerating the column ordinals of the variables he wants analyzed. In 
brief, given the subscripts 1, 2, ..., L, where each subscript is associated with one variable, the program is interested in 
how many and which variables were chosen. 
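The idea of naming variables by column ordinal can be sketched in a few lines of Python. This is only an illustration of the selection described above, not STATPAC code; the function name `select_variables` and the sample data are hypothetical.

```python
# Hypothetical helper, not part of STATPAC: pick variables by 1-based column ordinal.
def select_variables(observations, ordinals):
    """Return each observation (row) reduced to the variables whose
    1-based column ordinals appear in `ordinals`."""
    n_vars = len(observations[0])
    for j in ordinals:
        if not 1 <= j <= n_vars:
            raise ValueError(f"variable ordinal {j} is outside 1..{n_vars}")
    return [[row[j - 1] for j in ordinals] for row in observations]


# N = 3 observations (rows) of L = 3 variables: age, weight, height.
town = [
    [34, 150.0, 68.0],
    [51, 182.5, 71.0],
    [27, 128.0, 64.5],
]
chosen = select_variables(town, [1, 3])  # analyze variables 1 and 3 only
```

Here the answer to "What are the variables?" is the list `[1, 3]`: two variables were chosen, and they are the first and third columns.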

The standardized binary data files are organized on the tape written by the INPUT module as follows: 



Unit 1 has one record which contains L, the number of variables 
in each observation, and the names of each variable. 

Units 2 through K each have one record which specifies the number of 
observations within the unit, N, and N records which contain the 
values of the variables for each observation. All units, except possibly 
the Kth, have the same number of observations. Unit K may 
have fewer than N observations. 



The last unit (K+1) has one record which contains to signal the 



(Figure: layout of the standardized binary data file as UNIT 1, UNIT 2, UNIT 3, UNIT 4, ..., UNIT K-1, UNIT K)

1.3 SMMRY MODULE 

The SMMRY module reads data files designated by the user, analyzes the data, and outputs the following statistics for 
each variable which was selected by the user for analysis: 

Mean 

Variance 

Standard Deviation 

Standard Error of the Mean 

Skewness 

Kurtosis 

Maximum 

Minimum 

Range 

Correlation Matrix 

1.3.1 SMMRY Statistics 



(Figure: NORMAL DISTRIBUTION, labeled with MEAN, STANDARD DEVIATION, and 1/2 of Area)

The previous statistics are estimates of the corresponding parameters of the populations from which 
the samples were drawn. The mean serves to specify the "center" of the data, while the standard 
deviation is a measure of the scatter, or dispersion, of the data from the center. The variance is 
the square of the standard deviation. 

(Figure: THREE NORMAL DISTRIBUTIONS, Curve A, Curve B, and Curve C, with $\sigma_A < \sigma_B < \sigma_C$)

The figure shows the changes in the shape of a curve effected by varying the standard deviation, $\sigma$. 

The skewness is used to measure the symmetry of a distribution about the mean. Since the normal 
distribution is symmetric, skewness is used to test whether a distribution is not normal. 

(Figure: SKEWED DISTRIBUTION, labeled with Mean and 1/2 the Area of Skewed Curve)

The sign of the skewness statistic indicates the direction of the skew. 

(Figure: three curves labeled No Skewness, Positive Skewness, and Negative Skewness)

Kurtosis measures the relative concentration of values of a sample; i.e., about the "center", 
the "tails", and the "shoulders" of the distribution. The illustration compares curves with 
different degrees of kurtosis. 

(Figure: curves labeled Kurtosis More than Normal, Normal, and Less than Normal)

The maximum is the highest observed value and the minimum is the lowest observed value. Their difference is the 
range. 

The correlation matrix indicates whether any pairs of variables in a file are highly correlated. Independent variables 
which are too highly correlated should not be used in the same regression analysis problem. 
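The single-variable statistics listed above can be sketched in Python as follows. The exact algorithms STATPAC uses are given in Appendix A and are not reproduced here, so the definitions in this sketch are assumptions: the variance uses an N - 1 divisor, and skewness and kurtosis are taken as the ratios m3/m2^1.5 and m4/m2^2 of central moments. The function name `summary` is hypothetical, and the correlation matrix is omitted.

```python
import math


def summary(values):
    """Descriptive statistics for one variable (assumed textbook definitions)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    variance = sum(d * d for d in dev) / (n - 1)   # assumed N - 1 divisor
    std_dev = math.sqrt(variance)
    # Central moments with an N divisor, used only for the shape statistics.
    m2 = sum(d ** 2 for d in dev) / n
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    return {
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "std_error": std_dev / math.sqrt(n),       # standard error of the mean
        "skewness": m3 / m2 ** 1.5,                # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2,                  # 3 for a normal population
        "maximum": max(values),
        "minimum": min(values),
        "range": max(values) - min(values),
    }


stats = summary([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

For this sample the mean is 5, the variance is 32/7, and the positive skewness reflects the longer right-hand tail.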
The descriptive statistics module (SMMRY) will also enable the user to perform hypothesis testing. 

Table 1-1 
STATPAC Symbol Definitions 

Symbol              Definition 

$L$                 the number of variables in a user data file 

$j$                 the ordinal of a particular variable in a file ($1 \le j \le L$) 

$K$                 the number of files in an analysis 

$i$                 the ordinal of a particular file ($1 \le i \le K$) 

$N_i$               the number of observations in the i-th file 

$m$                 the observation ordinal ($1 \le m \le N_i$) 

$X_{jim}$           the j-th variable in the m-th observation of the i-th file 

$\sigma_{ji}$       the standard deviation of the j-th variable in the i-th file 

$\sigma_{ji}^2$     the variance of the j-th variable in the i-th file 

$\bar{X}_{ji}$      *the calculated mean of the j-th variable in the i-th file 

$\mu_{ji}$          *the actual mean of the j-th variable in the i-th file 

$\mu_j$             the user supplied test mean for the j-th variable 

$\sigma_j^2$        the user supplied test variance for the j-th variable 

$y_i$               the observed value of the dependent variable 

$\hat{y}_i$         the predicted value of the dependent variable determined using the 
                    regression model 

$\bar{y}$           the calculated mean of the observed dependent variable 

$b_i$               the coefficient of the i-th variable in a regression model 

$b_0$               the constant term of a regression model. 

*These symbols are used interchangeably in descriptions of hypotheses. 

1.3.2 SMMRY Options 

The SMMRY module of STATPAC includes six hypothesis test options. Each option permits the user to test one or 
more actual hypotheses. The user requests a specific option in response to the initial dialogue as described in Chapter 2. 

SMMRY Option 1 allows the user to test hypotheses which relate the calculated means for variables to user-supplied test 
means. These hypothesis tests may be performed upon one or more data files (up to 10 files). The statistic calculated 
for each file, however, is independent of that calculated for any other data file. 



STATPAC calculates the following t-statistic when option 1 is requested: 

$$t_{ji} = \frac{(\bar{X}_{ji} - \mu_j)\sqrt{N_i}}{\sigma_{ji}}$$

Under the assumption that the sample came from a normal population, the user can use the statistic $t_{ji}$ to test hypotheses 
which relate the calculated mean of a variable ($\mu_{ji}$) to the user-supplied test mean for that variable ($\mu_j$), as summarized 
below. 



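Hypothesis                 Acceptance Criteria                                                  Alternative Hypothesis 

$\mu_{ji} = \mu_j$         $-t_{(1-\alpha/2)(N_i-1)} < t_{ji} < t_{(1-\alpha/2)(N_i-1)}$        $\mu_{ji} \ne \mu_j$ 

$\mu_{ji} \le \mu_j$       $t_{ji} < t_{(1-\alpha)(N_i-1)}$                                     $\mu_{ji} > \mu_j$ 

$\mu_{ji} \ge \mu_j$       $t_{ji} > -t_{(1-\alpha)(N_i-1)}$                                    $\mu_{ji} < \mu_j$ 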

When the acceptance criterion is not satisfied at the user-specified significance level ($\alpha$), the alternate hypothesis is accepted. 
The t-values are obtained from a statistical table using the values of $N_i - 1$ (the degrees of freedom) and the expression in 
$\alpha$ as the parameters for selecting the t-value from the table. 
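As a rough illustration of option 1, the Python sketch below computes the t-statistic for one variable in one file and applies the two-sided acceptance criterion. It assumes the standard deviation is computed with an N - 1 divisor; the function names are hypothetical, and the critical value 2.365 is the tabulated t for 7 degrees of freedom at 1 - α/2 = 0.975.

```python
import math


def t_statistic(sample, test_mean):
    """t = (calculated mean - test mean) * sqrt(N) / standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)  # assumed divisor
    return (mean - test_mean) * math.sqrt(n) / math.sqrt(variance)


def accept_equal_mean(t, t_critical):
    """Two-sided criterion: accept 'mean equals test mean' if -t_c < t < t_c."""
    return -t_critical < t < t_critical


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # N = 8, so 7 degrees of freedom
t = t_statistic(sample, test_mean=4.0)
accepted = accept_equal_mean(t, t_critical=2.365)    # value read from a t-table
```

The table lookup stays with the user, as in STATPAC: the program supplies the statistic and the degrees of freedom, and the user supplies the critical value.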

SMMRY Option 2 allows the user to test hypotheses which relate the variance to a user-supplied test variance. These 
hypothesis tests may be performed upon one or more data files (up to 10 files). The statistic calculated for each file, 
however, is independent of that calculated for any other data file. 

STATPAC calculates the following chi-square statistic when option 2 is requested: 

$$\chi^2_{ji} = \frac{\sum_{m=1}^{N_i} (X_{jim} - \bar{X}_{ji})^2}{\sigma_j^2}$$

Assuming a normal population, the $\chi^2_{ji}$ statistic may then be used by the statistician to test hypotheses which relate the 
calculated variance ($\sigma_{ji}^2$) with the user-supplied variance ($\sigma_j^2$), as summarized below. 



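Hypothesis                           Acceptance Criteria                                                            Alternative Hypothesis 

$\sigma_{ji}^2 = \sigma_j^2$         $\chi^2_{(\alpha/2)(N_i-1)} < \chi^2_{ji} < \chi^2_{(1-\alpha/2)(N_i-1)}$      $\sigma_{ji}^2 \ne \sigma_j^2$ 

$\sigma_{ji}^2 \le \sigma_j^2$       $\chi^2_{ji} < \chi^2_{(1-\alpha)(N_i-1)}$                                     $\sigma_{ji}^2 > \sigma_j^2$ 

$\sigma_{ji}^2 \ge \sigma_j^2$       $\chi^2_{ji} > \chi^2_{(\alpha)(N_i-1)}$                                       $\sigma_{ji}^2 < \sigma_j^2$ 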



When the acceptance criterion is not satisfied at the user-specified significance level ($\alpha$), the alternate hypothesis is accepted. 
The chi-square values are obtained from a statistical table using the values of $N_i - 1$ (degrees of freedom) and the 
expression in $\alpha$ as the parameters for selecting the chi-square value from the table. 
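A minimal Python sketch of the option 2 statistic follows, assuming it is the sum of squared deviations from the calculated mean divided by the user-supplied test variance. The function names are hypothetical; the critical values 1.690 and 16.013 are the tabulated chi-square values for 7 degrees of freedom at 0.025 and 0.975.

```python
def chi_square_statistic(sample, test_variance):
    """Sum of squared deviations from the sample mean over the test variance."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / test_variance


def accept_equal_variance(chi2, lower_critical, upper_critical):
    """Two-sided criterion: accept if the statistic lies between the table values."""
    return lower_critical < chi2 < upper_critical


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # N = 8, so 7 degrees of freedom
chi2 = chi_square_statistic(sample, test_variance=4.0)
accepted = accept_equal_variance(chi2, 1.690, 16.013)  # values read from a table
```

Unlike the t-test, the two-sided chi-square criterion needs two different table values, because the chi-square distribution is not symmetric.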

SMMRY Option 3 allows the user to test hypotheses which relate the mean of a variable in one file to the mean of the 
corresponding variable (i.e., same ordinal) in another file. Thus, at least 2 files must be included in the analysis, but not 
more than 10 files may be analyzed. 
STATPAC calculates the following t-statistics when option 3 is requested by the user: 

$$t_{jrs} = \frac{\bar{X}_{jr} - \bar{X}_{js}}{s\sqrt{1/N_r + 1/N_s}}$$

where 

$$s^2 = \frac{(N_r - 1)\sigma_{jr}^2 + (N_s - 1)\sigma_{js}^2}{N_r + N_s - 2}$$

In the calculation, r and s vary from 1,2,...,K for each value of j. Results are provided by STATPAC for each value of j 
(i.e., for each variable being analyzed). Thus, for every value of j, there is a K x K matrix generated (where K is the 
number of files in the analysis). 

Under the assumption that the samples came from normal populations with $\sigma_{jr}^2 = \sigma_{js}^2$, the user can perform the following 
hypothesis tests for the variables of each possible pair of files in the analysis using the statistic $t_{jrs}$. Each hypothesis test 
relates variables with the same ordinal, but contained in different files. 

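Hypothesis                     Acceptance Criteria                                                              Alternate Hypothesis 

$\mu_{jr} = \mu_{js}$          $-t_{(1-\alpha/2)(N_r+N_s-2)} < t_{jrs} < t_{(1-\alpha/2)(N_r+N_s-2)}$           $\mu_{jr} \ne \mu_{js}$ 

$\mu_{jr} \le \mu_{js}$        $t_{jrs} < t_{(1-\alpha)(N_r+N_s-2)}$                                            $\mu_{jr} > \mu_{js}$ 

$\mu_{jr} \ge \mu_{js}$        $t_{jrs} > -t_{(1-\alpha)(N_r+N_s-2)}$                                           $\mu_{jr} < \mu_{js}$ 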

When the acceptance condition is not satisfied at the user specified significance level ($\alpha$), the alternate hypothesis is 
accepted. The t-values are obtained from statistical tables using the values of $N_r + N_s - 2$ (the sum of the separate degrees 
of freedom $N_r - 1$ and $N_s - 1$), and the expression in $\alpha$ as the parameters for selecting the value from the table. 
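The K x K matrix of option 3 can be sketched in Python as below. This assumes the usual pooled two-sample form of the t-statistic, built on the pooled variance with N_r + N_s - 2 degrees of freedom; the function names and data are hypothetical.

```python
import math


def mean_and_variance(sample):
    """Sample mean and variance (N - 1 divisor assumed)."""
    n = len(sample)
    mean = sum(sample) / n
    return mean, sum((x - mean) ** 2 for x in sample) / (n - 1)


def pooled_t(sample_r, sample_s):
    """Two-sample t for one variable taken from files r and s."""
    n_r, n_s = len(sample_r), len(sample_s)
    mean_r, var_r = mean_and_variance(sample_r)
    mean_s, var_s = mean_and_variance(sample_s)
    s2 = ((n_r - 1) * var_r + (n_s - 1) * var_s) / (n_r + n_s - 2)  # pooled variance
    return (mean_r - mean_s) / math.sqrt(s2 * (1.0 / n_r + 1.0 / n_s))


def t_matrix(files):
    """K x K matrix of t-values, one entry per ordered pair of files (r, s)."""
    return [[pooled_t(r, s) for s in files] for r in files]


# One variable observed in K = 2 files.
files = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0]]
matrix = t_matrix(files)
```

The matrix is antisymmetric with a zero diagonal, so only the entries above the diagonal carry independent information.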

SMMRY Option 4 allows the user to test hypotheses which relate the variance of a variable in one file with the variance 
of the corresponding variable (i.e., same ordinal) in a second file. Analysis of at least two files must be performed for 
this option to be executed, but no more than 10 files may be included in the analysis. 

When option 4 is requested, STATPAC computes the following F-statistic: 

$$F_{jrs} = \sigma_{jr}^2 / \sigma_{js}^2$$

where r and s vary from 1,2,...,K for each value of j. These F-values are output by STATPAC for each variable in the 
analysis (j, where $1 \le j \le L$) and for all combinations of values for r and s (r, s = 1,2,...,K). Thus, for every value of j, 
there is a K x K matrix generated (where K = the number of files in the analysis). 

Under the assumption that the samples were drawn from normal populations, the user can perform the following 
hypothesis tests for a fixed variable and for each pair of files in the analysis, using the computed statistic $F_{jrs}$. Each 
hypothesis test relates the variance of a variable with the variance of a variable having the same ordinal, but contained 
in a different file. 
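Hypothesis                               Acceptance Criteria                                                                  Alternate Hypothesis 

$\sigma_{jr}^2 = \sigma_{js}^2$          $F_{(\alpha/2)(N_r-1,N_s-1)} < F_{jrs} < F_{(1-\alpha/2)(N_r-1,N_s-1)}$              $\sigma_{jr}^2 \ne \sigma_{js}^2$ 

$\sigma_{jr}^2 \le \sigma_{js}^2$        $F_{jrs} < F_{(1-\alpha)(N_r-1,N_s-1)}$                                              $\sigma_{jr}^2 > \sigma_{js}^2$ 

$\sigma_{jr}^2 \ge \sigma_{js}^2$        $F_{jrs} > F_{(\alpha)(N_r-1,N_s-1)}$                                                $\sigma_{jr}^2 < \sigma_{js}^2$ 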

When the acceptance condition is not satisfied at the user-specified significance level ($\alpha$), the alternate hypothesis is 
accepted. The F-values are obtained from statistical tables using the values of the degrees of freedom for each file, 
$N_r - 1$ and $N_s - 1$, and the expression in $\alpha$ as the parameters for selecting the F-value from the table. 
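The variance-ratio matrix of option 4 is simple enough to sketch directly. The Python below is illustrative only; the function names and data are hypothetical, and the variances use an assumed N - 1 divisor.

```python
def sample_variance(sample):
    """Sample variance with an N - 1 divisor (assumed)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


def f_matrix(files):
    """K x K matrix whose (r, s) entry is variance of file r over variance of file s."""
    variances = [sample_variance(f) for f in files]
    return [[v_r / v_s for v_s in variances] for v_r in variances]


# One variable observed in K = 2 files; the second file is the first doubled.
files = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0]]
ratios = f_matrix(files)
```

Entry (r, s) is the reciprocal of entry (s, r), and the degrees of freedom for the table lookup swap accordingly.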

SMMRY Option 5 allows the user to test the hypothesis that, for a particular variable, the means of that variable in all 
files of the analysis are equal at the user-specified significance level. Analysis of at least 2 files must be performed for 
this option to be executed, but no more than 10 files may be included. Option 5 is a generalization of option 3. 

When option 5 is requested, STATPAC computes the following F-statistic for each variable j analyzed: 

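$$F_j = \frac{\left[\sum_{i=1}^{K} N_i(\bar{X}_{ji} - \bar{X}_j)^2\right] / (K - 1)}{\left[\sum_{i=1}^{K}\sum_{m=1}^{N_i} (X_{jim} - \bar{X}_{ji})^2\right] / \sum_{i=1}^{K}(N_i - 1)}$$

where 

$$\bar{X}_j = \Big(\sum_{i=1}^{K} \bar{X}_{ji}\Big) / K$$

Under the assumption that all samples were drawn from normal populations with equal variance (i.e., $\sigma_{jr}^2 = \sigma_{js}^2$ for all 
r, s = 1,2,...,K) the user may test the following hypothesis: 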

Hypothesis                                        Acceptance Criteria                    Alternate Hypothesis 

$\mu_{j1} = \mu_{j2} = \dots = \mu_{jK}$          $F_j < F_{(1-\alpha)(V_1,V_2)}$        $\mu_{jr} \ne \mu_{js}$ for some r and s 

where 

$$V_1 = K - 1, \qquad V_2 = \sum_{i=1}^{K} (N_i - 1)$$

When the acceptance condition is not satisfied at the user-specified significance level ($\alpha$), the alternate hypothesis is 
accepted. The F-values are obtained from statistical tables using the values of $1 - \alpha$, $V_1$, and $V_2$ as the parameters for 
selecting the F-value from the table. 
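A possible Python sketch of the option 5 computation is shown below. It assumes the statistic is the between-file mean square over the within-file mean square, with the overall mean taken as the plain average of the K file means, as in the definition above. The function name `equal_means_f` and the data are hypothetical.

```python
def equal_means_f(files):
    """Return (F, V1, V2) for one variable observed in K files."""
    k = len(files)
    means = [sum(f) / len(f) for f in files]
    overall = sum(means) / k                      # average of the K file means
    v1 = k - 1
    v2 = sum(len(f) - 1 for f in files)
    between = sum(len(f) * (m - overall) ** 2 for f, m in zip(files, means)) / v1
    within = sum((x - m) ** 2 for f, m in zip(files, means) for x in f) / v2
    return between / within, v1, v2


# One variable observed in K = 3 files of 3 observations each.
files = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
f_value, v1, v2 = equal_means_f(files)
```

The user would compare `f_value` with the tabulated F for `v1` and `v2` degrees of freedom at 1 - α.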
SMMRY Option 6 allows the user to perform Bartlett's test for equal variances for a particular variable in all data files 
in the analysis (normal populations are assumed). Analysis of at least 2 files must be performed for this option to be 
executed, but no more than 10 files may be included. Option 6 is a generalization of option 4. 

When option 6 is requested by the user, STATPAC computes the following chi-square statistic and correction: 



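$${}_j\chi^2_{(K-1)} = (\log_e 10)\left\{\left[\sum_{i=1}^{K}(N_i - 1)\right]\log_{10}\left[\frac{\sum_{i=1}^{K}\sum_{m=1}^{N_i}(X_{jim} - \bar{X}_{ji})^2}{\sum_{i=1}^{K}(N_i - 1)}\right] - \sum_{i=1}^{K}(N_i - 1)\log_{10}\sigma_{ji}^2\right\}$$

$$C = 1 + \frac{1}{3(K - 1)}\left[\sum_{i=1}^{K}\frac{1}{N_i - 1} - \frac{1}{\sum_{i=1}^{K}(N_i - 1)}\right]$$

$$\text{corrected } {}_j\chi^2_{(K-1)} = {}_j\chi^2_{(K-1)} / C$$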

where C is the correction factor. Under the assumption that all samples are drawn from normal populations, the user 
may test the following hypothesis: 



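Hypothesis                                                       Acceptance Criteria                                  Alternate Hypothesis 

$\sigma_{j1}^2 = \sigma_{j2}^2 = \dots = \sigma_{jK}^2$          ${}_j\chi^2_{(K-1)} < \chi^2_{(1-\alpha)(K-1)}$      $\sigma_{jr}^2 \ne \sigma_{js}^2$ for some r and s 

where ${}_j\chi^2_{(K-1)}$ may be the corrected or the uncorrected value computed by STATPAC. 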

When the acceptance condition is not satisfied at the user specified significance level ($\alpha$), the alternate hypothesis is 
accepted. The chi-square values are obtained from statistical tables using the values of $1 - \alpha$ and $K - 1$ as the parameters 
for selecting the chi-square value from the table. 
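Bartlett's test can be sketched in Python as follows. This assumes the standard textbook form of the statistic and of the correction factor C; natural logarithms are used, which is equivalent to multiplying base-10 logarithms by log_e 10. The function name `bartlett` and the data are hypothetical.

```python
import math


def bartlett(files):
    """Return (uncorrected chi-square, C, corrected chi-square) with K - 1 d.f."""
    k = len(files)
    dof = [len(f) - 1 for f in files]
    variances = []
    for f in files:
        mean = sum(f) / len(f)
        variances.append(sum((x - mean) ** 2 for x in f) / (len(f) - 1))
    total_dof = sum(dof)
    # Pooled variance: all within-file squared deviations over the total d.f.
    pooled = sum(d * v for d, v in zip(dof, variances)) / total_dof
    chi2 = total_dof * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dof, variances)
    )
    c = 1.0 + (sum(1.0 / d for d in dof) - 1.0 / total_dof) / (3.0 * (k - 1))
    return chi2, c, chi2 / c


# Equal variances give a statistic of zero; unequal variances give a positive one.
same = bartlett([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
different = bartlett([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
```

Because C is always greater than 1, the corrected value is always somewhat smaller than the uncorrected one.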

For a more complete description of the options, the reader is referred to Chapter 7 of Statistics in Research by Bernard 
Ostle. 

1.4 STPRG AND MLTRG MODULES 

STPRG denotes the Stepwise Linear Regression Module and MLTRG denotes the Multiple Linear Regression Module. 
These modules are logically separate, but still have much in common (including a similar algorithm, input/output format, 
and internal organization). Because of their similarities, these modules are described together, with differences clearly 
noted where they exist. 
1.4.1 Regression Analysis 

Assuming a set of N observations, where each observation consists of L + 1 variables, consider the first L variables to be 
independent and denoted by $X_i$, i = 1,2,...,L. Consider the last variable to be a dependent variable, denoted y. To 
summarize, the data will appear as follows: 

$$\begin{matrix}
X_{1,1} & X_{1,2} & X_{1,3} & \cdots & X_{1,L} & Y_1 \\
X_{2,1} & X_{2,2} & X_{2,3} & \cdots & X_{2,L} & Y_2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
X_{N,1} & X_{N,2} & X_{N,3} & \cdots & X_{N,L} & Y_N
\end{matrix}$$

Now, let us assume that there exists a model (i.e., a rule, relationship, formula, or equation) which defines y as a function 
of the X's. This model can be expressed in the following manner: 

$$Y = B_0 + B_1X_1 + B_2X_2 + \dots + B_LX_L + E$$

where $B_0, B_1, \dots, B_L$ are the parameters of the model, and E is a true error to compensate for any discrepancies in the 
model. The task of regression analysis is to estimate or approximate this model, as follows: 

$$Y = b_0 + b_1X_1 + b_2X_2 + \dots + b_LX_L + e$$

where the b's are estimates of the parameters and e is the residual of the estimating model. The estimating model is 
applied to all observations in the set of data as follows: 

$$\hat{y}_i = b_0 + b_1X_{i,1} + b_2X_{i,2} + \dots + b_LX_{i,L}, \qquad i = 1,2,\dots,N$$

where $\hat{y}_i$ is the estimate of $y_i$ in the data, $e_i = y_i - \hat{y}_i$ is the residual for the i-th estimate, and N is the number of 
observations. The "goodness" criterion of the estimating model is that the sum of the squares of the residuals must be a 
minimum (i.e., least squares). The criterion may be expressed as follows: 

$$\sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} e_i^2 = \text{minimum}$$
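The least squares criterion can be illustrated with a short Python sketch that solves the normal equations by Gauss-Jordan elimination. This is one common way to minimize the sum of squared residuals and is not claimed to be the algorithm STATPAC uses (see Appendix A for that); the function names and data are hypothetical.

```python
def fit_least_squares(xs, ys):
    """Return [b0, b1, ..., bL] minimizing the sum of squared residuals.

    xs holds one list of L independent variables per observation,
    ys holds the dependent variable for each observation.
    """
    rows = [[1.0] + [float(v) for v in x] for x in xs]  # leading 1 carries b0
    p = len(rows[0])
    # Augmented matrix of the normal equations (A^T A) b = A^T y.
    aug = []
    for i in range(p):
        lhs = [sum(r[i] * r[j] for r in rows) for j in range(p)]
        rhs = sum(r[i] * y for r, y in zip(rows, ys))
        aug.append(lhs + [rhs])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("independent variables are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def residuals(xs, ys, b):
    """e_i = y_i minus the value predicted by the fitted model."""
    return [y - (b[0] + sum(bj * v for bj, v in zip(b[1:], x)))
            for x, y in zip(xs, ys)]


# Data generated exactly from y = 1 + 2*x1 + 3*x2, so the residuals vanish.
xs = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
ys = [1, 3, 4, 6, 8]
b = fit_least_squares(xs, ys)
e = residuals(xs, ys, b)
```

With real data the residuals would not vanish; their sum of squares is the quantity the fit makes as small as possible.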

A regression model with two independent variables is illustrated below. This hyperplane is determined as the best fit for 
the equation $y = b_0 + b_1X_1 + b_2X_2$ by STATPAC regression analysis. The predicted values ($\hat{y}_i$), the observed values ($y_i$), 
and the residuals ($e_i$) are shown for three observations. 
Because each $b_i$ in the model is raised to the first power and because there are several x's, this analysis is referred to as 
multiple linear regression. At times, one may suspect that some of the independent variables do not significantly 
contribute to the prediction quality of the model. In such a case, one can examine the contribution of each variable to the 
model and, using the following criteria, include or exclude variables. To delete a variable that is currently 
in the model, the increase in the residual variance caused by the elimination of the variable from the model is calculated. 
If the increase is not significant according to a user pre-selected level, the variable is deleted from the regression model; 
otherwise it remains in the model. To enter a variable that is not currently in the model, the decrease in the 
residual variance caused by the inclusion of this variable is calculated. If this decrease is significant, according to a user 
pre-selected level, the variable is added to the regression equation; otherwise it is not added. The technique of examining 
variables individually is denoted stepwise linear regression. (Note that stepwise implies multiple variables, hence multiple 
linear regression.) 

The output of regression analysis consists of: 

a. Correlation of each independent variable with the dependent variable 

b. For stepwise regression, output is for each step or iteration and includes: 

(1) Variable entering (or leaving) the model 

(2) Sequential F-test which is compared with the user supplied value of F-IN (or F-OUT) to determine 
inclusion (exclusion) of the variable in the model 

(3) The degrees of freedom for this iteration 

(4) R-squared, the multiple correlation coefficient 

(5) The change in multiple correlation from the previous iteration 
(6) The standard error of the dependent variable 

(7) An analysis of variance table which includes an overall F-value which is used to test the hypothesis 
that all coefficients (except the constant term) are zero 

(8) A table of variables in the regression consisting of the respective coefficients, standard errors, and 
F-values needed to remove these terms from the model 

(9) The constant term of the model ($b_0$) 

(10) A table of variables not in the regression model and including the tolerance level, partial correlation 
with the dependent variable, and the F-value needed to enter these terms in the model 

c. For multiple linear regression, the output includes: 

(1) The value of R-squared, the multiple correlation coefficient 

(2) The standard error of y 

(3) An analysis of variance table, as in stepwise regression analysis 

(4) The regression coefficients with their standard errors 

(5) The constant term of the model ($b_0$) 

1.4.2 Regression Options 

The regression analysis modules have four options and four plots which may be requested by the user. The first three 
options allow the user to perform hypothesis tests. The last option is the output of residuals. (Analyses which involve 
many observations require considerable amounts of time to output residuals.) 

Option 1 of STPRG or MLTRG allows the user to test hypotheses $H_0: b_i = 0$ or $H_1: b_i \ne 0$. STATPAC computes the 
value of $t_i = b_i/\text{s.e.}(b_i)$ for i = 1,2,...,p - 1. If $|t_i| < t(N - p, 1 - \tfrac{1}{2}\alpha)$, $H_0$ is accepted; otherwise, $H_1$ is accepted. The 
hypothesis is accepted or rejected at the $100(1 - \alpha)\%$ significance (confidence) level. In the expression above, 

N is the number of observations in the data file 

p is defined by the model such that N - p = degrees of freedom (p is the number of coefficients in the model, 
$y = b_0 + b_1X_1 + \dots + b_{p-1}X_{p-1}$) 

$\alpha$ is the user selected significance level 

Option 1 output consists of the t-value ($t_i$) for each term in the regression model, and the degrees of freedom. 

Option 2 is a generalization of option 1 which tests the hypotheses $H_0: b_i = b_i'$ and $H_1: b_i \ne b_i'$ where the value of $b_i'$ is 
supplied by the user. (Thus option 1 is option 2 with $b_i' = 0$.) STATPAC computes $t_i = (b_i - b_i')/\text{s.e.}(b_i)$ for i = 1,2,...,p - 1. 
If $|t_i| < t(N - p, 1 - \tfrac{1}{2}\alpha)$, the hypothesis $H_0$ is accepted at the $100(1 - \alpha)\%$ significance level. Otherwise the hypothesis 
$H_1$ is accepted. The user is asked to supply the values for $b_i'$ in the command dialogue whenever this option is selected. 
Option 2 output consists of the t-values ($t_i$) for each term in the regression model, the user supplied coefficients from 
question 8 of the dialogue, and the degrees of freedom. 

Option 3 is a further generalization of options 1 and 2 and tests the hypothesis $H_0: L \le b_i \le H$, where H and L are 
limits of the confidence interval for each coefficient in the regression model. The confidence interval is computed by 
STATPAC at the $100(1 - \alpha)\%$ confidence level as represented by the following equation: 

$$b_i \pm t(N - p, 1 - \tfrac{1}{2}\alpha)\,(\text{s.e.}(b_i))$$

The value of $t(N - p, 1 - \tfrac{1}{2}\alpha)$ is supplied by the user from a t-table, for the chosen value of $\alpha$, in response to STATPAC 
dialogue. Option 3 output consists of the upper and lower bounds of the confidence intervals, and the degrees of 
freedom. 
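The three hypothesis-test options reduce to a few lines of arithmetic once a coefficient and its standard error are known. The Python sketch below is illustrative; the function names are hypothetical, and the coefficient, standard error, and critical t-value are made-up numbers rather than STATPAC output.

```python
def coefficient_t(b, std_error, b_test=0.0):
    """Options 1 and 2: t = (b - b') / s.e.(b); option 1 is the case b' = 0."""
    return (b - b_test) / std_error


def accept_h0(t, t_critical):
    """Accept H0 when |t| is below the tabulated t(N - p, 1 - alpha/2)."""
    return abs(t) < t_critical


def confidence_interval(b, std_error, t_critical):
    """Option 3: lower and upper limits b -/+ t * s.e.(b)."""
    half_width = t_critical * std_error
    return b - half_width, b + half_width


# Hypothetical coefficient 2.0 with standard error 0.5 and tabulated t of 2.0.
t_zero = coefficient_t(2.0, 0.5)                 # option 1: tests b = 0
t_user = coefficient_t(2.0, 0.5, b_test=1.5)     # option 2: tests b = 1.5
limits = confidence_interval(2.0, 0.5, 2.0)      # option 3: (lower, upper)
```

In this example the hypothesis b = 0 is rejected while b = 1.5 is accepted, which agrees with 1.5 lying inside the interval and 0 lying outside it.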
Option 4 is requested by the user when he desires output of residual values (output is on logical 3). Regression analysis 
with many variables will require considerable time to output the residuals. The residual output for each observation in 
the original file includes the observation ordinal, the observed dependent variable ($y_i$), the predicted value of y ($\hat{y}_i$) by 
the model, and the residual for each observation ($e_i = y_i - \hat{y}_i$). 

1.4.3 Regression Plots 

Plot 1 is predicted y versus residual for each observation. The plot should appear as a broad horizontal band with no 
trend. A plot which shows an increasing broadness indicates that the variance is not constant and the regression is 
suspect. Each element of the grid is a counter with maximum value equal to 9. 

Plot 2 is predicted y versus observed y. The plot indicates the ability of the model to predict the observed y's. The ideal 
situation would be the straight line, $\hat{y} = y$. As in plot 1, each element of the grid is a counter with maximum value equal 
to 9. 
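The grid of counters used by plots 1 and 2 can be pictured with the Python sketch below. Only the idea of a counter per grid element with a maximum of 9 comes from the text; the grid dimensions, the way points are scaled into cells, and the function name are assumptions made for illustration.

```python
def count_grid(xs, ys, n_cols=10, n_rows=5):
    """Bin (x, y) points into a grid; each cell counts points up to a maximum of 9."""

    def cell(value, low, high, n_cells):
        # Map a value in [low, high] to a cell index in 0..n_cells-1.
        if high == low:
            return 0
        return min(int((value - low) / (high - low) * n_cells), n_cells - 1)

    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in zip(xs, ys):
        row = cell(y, min(ys), max(ys), n_rows)
        col = cell(x, min(xs), max(xs), n_cols)
        grid[row][col] = min(grid[row][col] + 1, 9)  # counter saturates at 9
    return grid


# Twelve identical points all fall in one cell, whose counter stops at 9.
crowded = count_grid([1.0] * 12, [2.0] * 12)
# Four points on a diagonal land in four different cells.
diagonal = count_grid([0, 1, 2, 3], [0, 1, 2, 3], n_cols=4, n_rows=4)
```

A single printed digit per cell is enough to show where observations pile up, which is presumably why the counter is limited to 9.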

Plot 3 is residuals versus ordinal. The residuals are plotted versus the ordinal of the observation to check for time factors 
in the model. A non-horizontal plot is an indication that the model may be time dependent. That is, either variance 
increases with time or terms in time should have been included in the model. 

Plot 4 is an overall plot of residuals. This histogram of residuals indicates whether the residuals are normally distributed. 

For further description of the plots and their implications, refer to Chapter 3 of Applied Regression Analysis, by Draper 
and Smith (see Bibliography). Options 1, 2 and 3 are also described in Section 1.4 of the same reference. 
CHAPTER 2 
MODULE OPERATING PROCEDURES 

STATPAC is loaded as a compiled and chained FORTRAN program (see Chapter 3 of this manual and the applicable 
Users' Guide). Once STATPAC is loaded, the CONTROL module assumes control. The command dialogue and possible 
error messages for the five modules are described in this chapter. 

NOTES 

1. If the user has assigned the Teletype handler (TTA) to logical 5, he 
may use the RUBOUT key to erase one character to the left for each 
striking of the key. Control U may be typed to delete the whole line. 
These keys can only be used to erase up to the last carriage return. 

2. Characters typed by STATPAC are underlined in the following text 
to distinguish them from those typed by the user. 

3. Sample input and dialogue for each module is contained in Chapter 4. 

2.1 CONTROL MODULE 

2.1.1 Command Dialogue 

CONTROL contains one dialogue message which it types on logical 4. 

*PROG 

On logical 5, CONTROL expects one of the following five responses, left justified: 

INPUT 

SMMRY 

STPRG 

MLTRG 

EXIT 

2.1.2 Error Messages 

If the user does not respond with one of the above legal responses, the CONTROL module will type: 

*ERROR 
*PROG 



CONTROL then awaits the input of another module name. 
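The CONTROL dialogue amounts to a small validation loop, sketched here in Python. The treatment of blanks is an assumption: trailing blanks are ignored, and a response that is not left justified is treated as illegal. The names `LEGAL_NAMES` and `control_dialogue` are hypothetical.

```python
LEGAL_NAMES = ("INPUT", "SMMRY", "STPRG", "MLTRG", "EXIT")


def control_dialogue(responses):
    """Replay user responses; return (accepted name or None, messages typed)."""
    transcript = ["*PROG"]
    for response in responses:
        name = response.rstrip()          # assumed: trailing blanks do not matter
        if name in LEGAL_NAMES:
            return name, transcript
        transcript += ["*ERROR", "*PROG"]  # illegal response: report and ask again
    return None, transcript


# A misspelled name, a response that is not left justified, then a legal name.
selected, typed = control_dialogue(["SUMRY", " INPUT", "SMMRY"])
```

Each illegal response produces the pair of messages shown above, and the loop ends as soon as one of the five legal responses is read.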
2.2 INPUT MODULE 

2.2.1 Command Dialogue 

The execution of the INPUT module is directed by the following dialogue. The ASCII data file must be on logical 6, 
and the new file will be written on logical 1. 



Question 1: 

*FILE (OLD) 



AAAAAAAAA 



Question 2: 

*FILE (NEW) 
AAAAA 



Question 3: 

♦FORMAT 



(82,82, ...,Sjj) 



Question 4: 

*N0. OBS. 



mil 



Question 5 : 
*VARS 



VARII=AAAAA 
VARII= AAAAA 



VARII=AAAAA 



Question 6 : 

*ZERO OBS. 



AAA 



INPUT requests the user to identify the name of his data file with its nine character 
name (i.e., six character file name and three-character extension). If the device as- 
signed to logical 6 is not a bulk storage device, this name is of no consequence and 
a carriage return or blank card will suffice. If the device is bulk storage, the nine 
character name may be obtained from the directory listing and must be given 
exactly as given on the listing. 

Question 2 of INPUT requests the name to be associated with the standardized 
binary file which it will create. The user must respond with five characters in A5 
format as described in the FORTRAN Manual. The file will be identified with this 
name for all further analysis by 8TATPAC. If a non-bulk storage device is assigned 
to logical 1 , the name is of no consequence, and a carriage return or blank card will 
suffice. 

The user must respond by typing the format needed to read one observation of his 
BCD (ASCII) data. The number of fields specified in the format must correspond 
to the number of variables per observation (see Question 5). For a discussion of 
format statements, see the FORTRAN Manual. 

INPUT requests the number of observations in the user's data file. The user responds 
with an integer value of five characters right justified in the five character field. 
(Preceding spaces or Os must be supplied by the user for numbers with less than 
five characters.) 

Question 5 requests the user to specify the names of the variables in his data file. 
He need not specify the names of all of the variables, but he must include the high- 
est variable subscript in the list which may not be greater than 15. The highest sub- 
script given by the user defines the number of variables per observation. The re- 
sponse must be in the form given below: 



Character Position 


Content 


1 through 3 
4 and 5 

6 
7 through 1 1 


VAR 
subscript II (01 <II<15) 

variable name in A5 



The list of variables is terminated by a blank record (e.g., a simple carriage return 
or blank card). 

The user is asked if he wants those observations which contain at least one zero-valued 
variable printed. If the user wants these observations, he responds by typing "YES" in A3 
format. Any other response (e.g., "NO", or a simple carriage return) will suppress 
the typing of such observations. (See Section 2.2.2 for a more complete description.) 
Comment:        INPUT outputs this message to terminate the dialogue, and to indicate that 
*O.K.           processing will now begin. No response is necessary. 

NOTE 

Once a user BCD data file has been written in standardized binary format 
on logical 1 by INPUT, the user need not generate the binary file again. 

2.2.2 Error Messages 

If the user types an illegal subscript (greater than 15) to Question 5 (*VARS), INPUT will type the following message 
and will not accept the line which was typed. 

*IGNORED 



The user may continue with legal responses to the question. 

If a subscript less than 1 is typed, the input list is terminated and the following question will be asked (ZERO OBS.). 

If a bulk storage device is assigned to logical 6 and the user does not respond with the exact name of a file in response 
to Question 1 (*FILE (OLD)), IOPS 13 will result.¹ No recovery is possible except by restarting execution of STATPAC. 
Restarting is accomplished by typing CTRL/C after IOPS 13. The monitor from which execution may be restarted 
will be loaded. 

If a variable of a particular observation does not conform to the user-specified format (answer to the question *FORMAT), 
the variable in question may be recorded as 0 in the standardized file created by INPUT. The question "ZERO OBS." 
allows the user to monitor values for such losses. If the user requests such output, INPUT will type the entire 
observation on logical 4 together with the observation ordinal. The option cannot, however, distinguish between valid 
and assigned 0 values. 

NOTE 

1. Only five characters (A5) for a file name are supplied by the user in 
Question 2. STATPAC supplies " STP" (a blank followed by STP) as the 
remaining four characters of the file name in the bulk storage directory. 

2. All variables in BCD data on logical 6 must be real; i.e., E-, F- or 
G-type conversion. 
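
As a small illustration of Note 1, the nine-character directory name can be pictured as the A5 name padded with blanks, followed by a blank and STP. The sketch below is Python written only for illustration; the helper name is hypothetical, and truncating a response longer than five characters is an assumption rather than documented behavior.

```python
def directory_name(response):
    """Build the 9-character directory entry for a file created by INPUT."""
    name = response.ljust(5)[:5]   # A5 field: left justified, blank padded
    return name + " STP"           # blank plus STP, supplied by STATPAC
```

For the response BMD this gives "BMD   STP", which is the form to look for in a directory listing.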

2.3 DESCRIPTIVE STATISTICS MODULE 

2.3.1 Command Dialogue 

Question 0:     The module currently in command requests the user to specify the next module 
*PROG           to be used. Assume the user answers with the name of the descriptive statistics 
SMMRY           module, SMMRY. 

Question 1:     The user is asked to specify the files to be analyzed by the SMMRY module. The 
*FILE           response consists of the names assigned to the files during execution of the INPUT 
AAAAA           module (INPUT Question 2). Response must be in A5 format, left justified. As 
AAAAA           many as ten files may be analyzed at one time. The user terminates the list of file 
  .             names by supplying a blank record (e.g., a simple carriage return or a blank card). 
  .
AAAAA 

¹ Consult the Users' Guide of your computer system for a description of the IOPS 13 error. 
Question 2:     The user is asked to list the subscripts of the variables to be analyzed. The response 
*VARS           is in I2 format, right justified. At most, 15 subscripts may be listed, and the values 
II              of II must be 01 <= II <= 15. The list of subscripts is terminated by a blank record. 
II
 .
 .
II

Question 3:     The user is requested to indicate which, if any, hypothesis tests are desired. The 
*OPTS           user responds with a 0 if he does not want a specific option and with a 1 (or any 
IIIIII          positive integer) if he does want an option. The first position represents option 1, 
                the second position represents option 2, etc. If no options are desired, the user may 
                respond with a blank record. Examples: 

                001101    User requests options 3, 4, and 6. 
                100000    User requests option 1 only. 

Question 4:     If option 1 is requested in response to Question 3, the user is requested to provide 
*MEAN           a test mean for each variable in the analysis. The user must respond in the following 
MENII=±XXX.XXXX form. 
MENII=±XXX.XXXX
 .
 .
MENII=±XXX.XXXX

                Character Position      Content

                1 through 3             MEN
                4 through 5             subscript II (01 <= II <= 15)
                6                       =
                7 through 15            test mean in F9.6

                The list is terminated by a blank record. An entry for a subscript may be retyped 
                and only the last appearance will be used. If a variable which is in the analysis by 
                virtue of being listed in Question 2 is not assigned a test mean, a default test mean 
                of 0 is assumed. 

Question 5:     The user who has requested option 2 in Question 3 is requested to provide a test 
*VRNC           variance for each variable in the analysis. The user must respond in the following 
VARII=±XXX.XXXX form. 
VARII=±XXX.XXXX
 .
 .
VARII=±XXX.XXXX

                Character Position      Content

                1 through 3             VAR
                4 through 5             subscript II (01 <= II <= 15)
                6                       =
                7 through 15            test variance in F9.5

                The list is terminated by a blank record. An entry for a subscript may be retyped 
                after it has already been entered and only the last appearance will be used. If a 
                variable which is in the analysis because it was listed in Question 2 is not assigned 
                a test variance, a default test variance of 0 is assumed by STATPAC. 
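
The option string and the test-value lists of Questions 3 through 5 can be sketched as follows. This is an illustrative Python sketch, not STATPAC code; the function names are hypothetical. It assumes six option positions, that any nonzero digit requests an option, that a retyped subscript replaces the earlier entry, and that an unassigned variable defaults to 0.

```python
def parse_options(response, count=6):
    """Return the set of option numbers requested in an *OPTS response."""
    padded = response.ljust(count)[:count]      # a blank record requests nothing
    return {pos + 1 for pos, ch in enumerate(padded)
            if ch.isdigit() and ch != "0"}


def read_test_values(lines, keyword, variables):
    """Collect 'MENII=value' or 'VARII=value' records as {subscript: value}."""
    values = {subscript: 0.0 for subscript in variables}   # default of 0
    for line in lines:
        if line.strip() == "":                  # blank record ends the list
            break
        record = line.ljust(15)
        if record[0:3] != keyword or record[5] != "=":
            raise ValueError("expected %sII=value" % keyword)
        subscript = int(record[3:5])            # character positions 4 and 5
        if subscript < 1:                       # treated as a list terminator
            break
        if subscript > 15:                      # line is ignored
            continue
        if subscript in values:
            values[subscript] = float(record[6:15])   # last appearance wins
    return values
```

For example, `parse_options("001101")` selects options 3, 4, and 6, matching the first example given for Question 3.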

Comment:        STATPAC indicates the termination of dialogue and the start of the requested 
*O.K.           analysis by typing the comment O.K. 

2.3.2 Error Messages 

If the user responds with a subscript greater than 15 in response to Questions 2, 4, or 5, that particular line is completely 
ignored but no message is typed. If the user responds with a non-positive subscript to one of these same questions, this 
is treated as a list terminator and the dialogue proceeds to the next question. 

If the user responds to Question 2 (*VARS) with a subscript which is greater than the number of variables per observation 
in some data file which he has listed in Question 1 (*FILE), the message 

*ERR1 



will be output on logical 4. Note that the number of variables per observation is defined by the user's response to 
Question 5 of INPUT. When this error condition exists, the file in question is excluded from further analysis and pro- 
cessing continues. If all the user listed files from Question 1 are eliminated from analysis, the following message will be 
output on logical 4: 

*ERR2 

If the user requests option 3, 4, 5, or 6 and has listed only one file name for analysis in response to Question 1, these 
options will not be processed and no output will result, since they are meaningless for only one file. 

NOTES 

1. When the elements of the correlation matrix are being calculated, the 
terms 

\[ \sum_{m=1}^{N} (x_{mi} - \bar{x}_i)^2 \quad \text{and} \quad \sum_{m=1}^{N} (x_{mj} - \bar{x}_j)^2 \]

are used in the denominator of the expression for calculating c_ij. If both 
of these terms are not larger than TOL = .1E-9, then c_ij is given the 
default value 2.0. 

2. If, during the processing of option 2, a user supplied variance is found 
to be less than or equal to TOL = .1E-9, the corresponding statistic is given 
the default value of 1.E76. Similarly, if during the processing of option 1, 
a standard deviation is calculated from the data file which is less than TOL, 
the corresponding statistic is assigned the default value .1E76. Option 4 
operates similarly. 
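
The guard described in Note 1 can be sketched as below. This is a Python illustration only, and it rests on an assumption: the two denominator terms are taken to be the sums of squared deviations of each variable from its mean, as in the usual product-moment correlation. The function name is hypothetical.

```python
import math

TOL = 0.1e-9   # tolerance quoted in Note 1


def correlation(x, y, tol=TOL):
    """Correlation of two variables, or 2.0 when a denominator term is too small."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)      # first denominator term
    syy = sum((b - mean_y) ** 2 for b in y)      # second denominator term
    if not (sxx > tol and syy > tol):            # both must exceed TOL
        return 2.0                               # default flags an undefined value
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)
```

Because a true correlation lies between -1 and 1, the value 2.0 in printed output cannot be mistaken for a computed coefficient.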

2.4 REGRESSION ANALYSIS MODULES 

2.4.1 Command Dialogue 

Question 0:     The module currently in command types this question. The user is assumed to have 
*PROG           typed either STPRG or MLTRG. 
STPRG
  or
MLTRG
Question 1:     STPRG or MLTRG requests the name of the data file to be analyzed. The user 
*FILE           responds in A5 with the exact name of the file which was given in answer to INPUT 
AAAAA           Question 2. 

Question 2:     STPRG or MLTRG requests the subscripts of the variables to be analyzed. The user 
*VARS           must respond in I2, followed by a carriage return, and right justified in the two 
II              character field. Values must be 01 <= II <= 15. The last subscript of the list will be 
II              considered the dependent variable. Each subscript is terminated by a carriage return, 
II              and the list is terminated by a blank record (e.g., an extra carriage return or blank 
 .              card). If all variables of the file are to be analyzed, the user need only type the 
 .              subscript of the dependent variable. 
II

Question 3:     STPRG requests the F-value which will be used to determine if a variable not in the 
*FIN            model makes a significant contribution to the model and, therefore, should be added. 
XXXX.XXXX       The user responds in F9.5 followed by a carriage return. 

                                        NOTE 
                This question is asked only when the user response to Question 0 is STPRG. 

Question 4:     Question 4 of STPRG requests the F-value to determine if the contribution of a 
*FOUT           variable, which is in the estimating model, is insignificant and should therefore be 
XXXX.XXXX       excluded from the model. Response is in F9.5. See NOTE for Question 3. 

Question 5:     Question 5 of STPRG requests the user to specify the number of iterations or cycles 
*LIM            to be allowed in the calculation of the estimating model. The limit prevents STPRG 
II              from getting into a nonproductive loop of successively including and excluding 
                variables in the model. If the user responds with other than a positive integer, STPRG 
                will use a default limit equal to twice the number of independent variables being 
                analyzed. See NOTE for Question 3. 

Question 6:     Question 6 requests the tolerance factor used by STPRG and MLTRG. The tolerance 
*TOL            is used to check for constant observations and to check the diagonal elements of the 
XXXX.XXXX       correlation matrix to avoid trying to invert a badly behaved matrix. Values for TOL 
                are usually between 0.001 and 0.0001. If the user responds with a blank record, 
                STATPAC uses a default value of TOL = .001. 

Question 7:     STPRG or MLTRG requests the user to specify which options are desired. The four 
*OPTS           options are described in detail in Section 1.4.2. The user responds with a 0 if he does 
IIII            not want an option or with a 1 if he does want the option. Examples: 

                1000    Option 1 only 
                1010    Options 1 and 3 only 
                0011    Options 3 and 4 only 
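
The defaults described for Questions 5 through 7 can be summarized in a short sketch. It is Python written for illustration only; the function names are hypothetical and the handling of malformed input is an assumption.

```python
def iteration_limit(response, n_independent):
    """*LIM: use the response if it is a positive integer, else the default."""
    try:
        value = int(response)
    except ValueError:                 # blank or non-numeric response
        value = 0
    return value if value > 0 else 2 * n_independent


def tolerance(response, default=0.001):
    """*TOL: a blank record selects the default tolerance."""
    return float(response) if response.strip() else default


def selected(response, count=4):
    """*OPTS: list the positions that hold a 1."""
    padded = response.ljust(count, "0")[:count]
    return [pos + 1 for pos, ch in enumerate(padded) if ch == "1"]
```

With four independent variables, a blank response to *LIM therefore allows eight cycles of the stepping algorithm.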
Question 8:     This question is asked only if the user requests option 2 in Question 7 (i.e., he types 1 
*COEF           in the second position of the response to Question 7). The user must respond as 
COFII=XXXX.XXXX outlined below. 
COFII=XXXX.XXXX
 .
 .
COFII=XXXX.XXXX

                Character Position      Content

                1 through 3             COF
                4 and 5                 subscript II (01 <= II <= 15)
                6                       =
                7 through 15            test coefficient in F9.5

                The list is terminated by a blank record. 

Question 9:     This question is asked only if the user requests option 3 in Question 7 (i.e., he types 
*FCTR           a 1 in the third position of the response to Question 7). The user must respond with 
XXXX.XXXX       the value of t(N-p, 1-α/2) in F9.5. The t-value is obtained from a table by 
                estimating the degrees of freedom (N-p) and specifying a confidence level. The actual 
                degrees of freedom are output by the regression module and may be checked against 
                the estimated value used to obtain the response. 

Question 10:    The user is requested to specify the output plots which he would like. The plots are 
*PLTS           described in Section 1.4.3, and are listed below. The user types a 1 in the position 
IIII            corresponding to those plots he wants and a 0 in the positions corresponding to plots 
                he does not want. Examples: 

                1000    Plot 1 only 
                0010    Plot 3 only 
                1011    Plots 1, 3, and 4 

Comment:        The regression analysis modules type this message to indicate termination of the 
*O.K.           dialogue and start of the processing. 

NOTE 

An "A" appears in plotted output when the value to be plotted is a coun- 
ter which exceeds 9. 

2.4.2 Error Messages 

If the user responds to Question 2 (*VARS) or Question 8 (*COEF) with a subscript value greater than 15, the line in 
question will be ignored. If a non-positive subscript is typed, the list in question is terminated and STATPAC types the 
next question in the dialogue. 

If the following expression is less than the user-supplied tolerance for a specific variable with subscript j : 

\[ \left[ \sum_{i=1}^{N} (x_{ij} - \bar{x}_j)^2 \right]^{1/2} < \mathrm{TOL} \]

where N is the number of observations, the j variable is considered constant by STATPAC and the following error mes- 
sage is output on logical 4: 

*ERR1 j 

(The value j is the subscript of the variable which caused the error.) ERR 1 will terminate processing and a new dialogue 
will begin. 
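
The constant-variable check can be sketched as follows. This is an illustrative Python sketch with hypothetical function names. It assumes the tested quantity is the square root of the sum of squared deviations of variable j from its mean over the N observations; the exact expression is not fully legible in this copy of the Guide.

```python
import math


def is_constant(values, tol):
    """True when a variable's spread about its mean falls below TOL."""
    mean = sum(values) / len(values)
    spread = math.sqrt(sum((v - mean) ** 2 for v in values))
    return spread < tol


def constant_variables(observations, tol=0.001):
    """Return the 1-based subscripts j that would be reported as *ERR1 j."""
    n_vars = len(observations[0])
    return [j + 1 for j in range(n_vars)
            if is_constant([obs[j] for obs in observations], tol)]
```

A variable flagged this way carries no information for the regression, which is why the tolerance should be examined rather than simply lowered until the error disappears.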

If the MLTRG module is being used and not all the independent variables can be entered (due in part to the choice of 
TOL), the following error message will be output on logical 4: 

*ERR2 

ERR 2 will terminate processing and STATPAC will begin a new dialogue. 

If the user has responded with a subscript greater than the number of variables per observation, STATPAC outputs the 
following error message: 

*ERR3 

ERR 3 will terminate processing and cause STATPAC to begin a new dialogue. Note that the number of variables per 
observation is defined by the user's response to Question 5 of INPUT. 

When the STPRG module is being used, a limit factor (LIM) is used to limit the number of passes in the stepping algorithm. 
Exceeding this limit causes the following error message on logical 4: 

*ERR4 

ERR 4 terminates processing and STATPAC begins a new dialogue. 

If no variables are entered into the regression model when using either the MLTRG or the STPRG modules, the following 
message is output on logical 4: 

*ERR5 

ERR 5 terminates processing and STATPAC begins a new dialogue. 

NOTE 

When any of the above error conditions occur, the user should not in- 
discriminately adjust the values of TOL, FIN, LIM, etc., to force a com- 
plete analysis. The user should closely examine the variables and limits 
involved before making any such adjustments. 



CHAPTER 3 
IMPLEMENTING AND AUGMENTING STATPAC 



This chapter describes the procedure to be followed when building a STATPAC executable file using the PDP-15 and 
PDP-9 Monitors. Refer to CHAIN and EXECUTE of the Users' Guide for the general description for chaining the 
STATPAC program. The procedure for building an executable file with specific hardware and handler assignments is 
described later in this chapter. 

STATPAC modules make use of the following logical units. Specific device handlers must be assigned to these units when 
the executable file is built. 



Logical Unit    Function within STATPAC

    -4          Contains STATPAC in executable form in IOPS binary. 
     1          Contains standardized binary data files written by the INPUT module 
                in IOPS binary. 
     2          Stores temporary files during processing by a STATPAC module in 
                IOPS binary. 
     3          Hard copy statistical output in IOPS ASCII. 
     4          Program queries and error messages in IOPS ASCII. 
     5          User responses to queries in IOPS ASCII. 
     6          User supplied BCD data files as input to the INPUT module in IOPS 
                ASCII. 
     7          Temporary storage of residuals (regression) and temporary storage of 
                option output for SMMRY in IOPS binary. 



If the user assigns a bulk storage device to logical unit 3, the output of STATPAC will be recorded in files with the fol- 
lowing file names: 



Descriptive Statistics 

    SMMRY STP   Contains the standard SMMRY module output 
    OPTON STP   Contains the output of the options for SMMRY 

Regression Analysis 

    REGRS STP   Contains the standard regression output (for each step in the case of STPRG) 
                and the residual output, if requested 
    OPTPL STP   Contains the output for regression options 1, 2, and 3 and the plotted output, 
                if requested 



3.1 BUILDING AN EXECUTABLE FILE 

STATPAC is supplied to the user in three forms: 

a. Source files for each STATPAC chain. 

b. Binary files of the FORTRAN compiled STATPAC source for each chain. 

c. An executable file with fixed handler assignments, chained as described in this section. 

The files are organized according to the following chart. 



Module              Source File     Binary File

CONTROL             CH01 SRC        CH01 BIN
INPUT               CH03 SRC        CH03 BIN
SMMRY               CH06 SRC        CH06 BIN
SMMRY               CH07 SRC        CH07 BIN
SMMRY               CH08 SRC        CH08 BIN
SMMRY               CH09 SRC        CH09 BIN
STPRG & MLTRG       CH10 SRC        CH10 BIN
STPRG & MLTRG       CH11 SRC        CH11 BIN
STPRG & MLTRG       CH12 SRC        CH12 BIN
STPRG & MLTRG       CH13 SRC        CH13 BIN







NOTE 

16K of core memory is required to build the STATPAC executable file, 
although it will operate in an 8K memory. 

An executable file is produced from the above chains by following the steps outlined below. The description assumes 
that the user has two DECtapes as bulk storage devices for the Monitor System. Assuming also that the compiled 
binary files are on DECtape unit 1, the user should make the following handler assignments: 

$A DTA0 -1 
$A DTA1 -4,-6 
$A DTB1 1,2,6,7 
$A TTA 3,4,5 
Once CHAIN has been loaded from the system tape, the message "CHAIN V2A" is typed as shown below. The user then 
types the command "BUILD STATPC", thereby identifying the executable file to be built. The user then types all 
responses which are preceded by a > in the following listing. (Lines preceded by a > indicate user response is required. 
Lines not preceded by a > are typed by CHAIN.) 



CHAIN V2A 

>BUILD STATPC 
>C 1 
>CH01 
>END 
CH01    36655 
BCDIO   33662 
.SS     33603 
GOTO    33555 
STOP    33542 
SPMSG   33447 
FIOPS   32713 
OTSER   32617 
INTEGE  32467 
REAL    31466 
CHAIN#  1 
LOWEST  31466 
COMSZE  00010 

>C 3 
>CH03 
>END 
CH03    34672 
DTB.    32647 
FILE    31325 
.DA     31256 
BCDIO   26263 
BINIO   26012 
.SS     25733 
FIOPS   25177 
OTSER   25103 
INTEGE  24753 
REAL    23752 
CHAIN#  3 
LOWEST  23752 
COMSZE  00010 

>C 6 
>CH06 
>END 
CH06    35077 
DTB.    32054 
FILE    31532 
.DA     31463 
BCDIO   26470 
BINIO   26217 
.SS     26140 
GOTO    26112 
FIOPS   25356 
OTSER   25262 
INTEGE  25132 
REAL    24131 
CHAIN#  6 
LOWEST  24131 
COMSZE  00167 

>C 7 
>CH07 
>END 
CH07    33571 
DTB.    30546 
FILE    30224 
FLOAT   30213 
SQRT    30125 
.DA     30056 
BINIO   27605 
.SS     27526 
FIOPS   26772 
OTSER   26676 
INTEGE  26546 
REAL    25545 
CHAIN#  7 
LOWEST  22545 
COMSZE  00167 

>C 8 
>CH08 
>END 
CH08    33620 
DTB.    30575 
FILE    30253 
ABS     30235 
FLOAT   30224 
SQRT    30136 
ALOG10  30116 
.EE     30025 
.EC     27761 
.DA     27712 
BINIO   27441 
.SS     27362 
GOTO    27334 
FIOPS   26600 
OTSER   26504 
INTEGE  26354 
REAL    25353 
CHAIN#  10 
LOWEST  25353 
COMSZE  00167 

>C 9 
>CH09 
>END 
CH09    35511 
DTB.    32466 
FILE    32144 
.DA     32075 
BCDIO   27102 
BINIO   26631 
.SS     26552 
GOTO    26524 
FIOPS   25770 
OTSER   25674 
INTEGE  25544 
REAL    24543 
CHAIN#  11 
LOWEST  24543 
COMSZE  00167 

>C 10 
>CH10 
>END 
CH10    34420 
DTB.    31375 
FILE    31053 
.DA     31004 
BCDIO   26011 
BINIO   25540 
.SS     25461 
GOTO    25433 
FIOPS   24677 
OTSER   24603 
INTEGE  24453 
REAL    23452 
CHAIN#  12 
LOWEST  23452 
COMSZE  00125 

>C 11 
>CH11 
>END 
CH11    31553 
DTB.    26530 
FILE    26206 
ABS     26170 
IABS    26154 
FLOAT   26143 
SQRT    26055 
.DA     26006 
BINIO   25535 
.SS     25456 
FIOPS   24722 
OTSER   24626 
INTEGE  24476 
REAL    23475 
CHAIN#  13 
LOWEST  23475 
COMSZE  00273 

>C 12 
>CH12 
>END 
CH12    33003 
DTB.    27760 
FILE    27436 
ABS     27420 
.DA     27351 
BINIO   27100 
.SS     27012 
GOTO    26773 
FIOPS   26237 
OTSER   26143 
INTEGE  26013 
REAL    25012 
CHAIN#  14 
LOWEST  25012 
COMSZE  00273 

>C 13 
>CH13 
>END 
CH13    35217 
DTB.    32174 
FILE    31652 
.DA     31603 
BCDIO   26610 
BINIO   26337 
.SS     26260 
GOTO    26232 
FIOPS   25476 
OTSER   25402 
INTEGE  25252 
REAL    24251 
CHAIN#  15 
LOWEST  24251 
COMSZE  00025 

>CLOSE 

CHAIN V2A 
>EXIT 





MONITOR V4A 



NOTE 

Chain numbers are typed by the user in decimal, but CHAIN prints the 
chain numbers in octal. 
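
The decimal-to-octal correspondence in the listing can be checked with a one-line conversion. The sketch below is Python for illustration only; the helper name is hypothetical.

```python
def chain_label(decimal_chain):
    """Return the octal text CHAIN prints for a chain number typed in decimal."""
    return format(decimal_chain, "o")
```

For instance, the command "C 8" produces the line "CHAIN# 10" in the listing, and "C 13" produces "CHAIN# 15".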

When the message "CHAIN V2A" is typed at the end of the listing, STATPC is on DECtape 1 in executable format. 
After calling the MONITOR, STATPC is executed by typing: 

$A DTC1 -4 
$E STATPC 

NOTE 

STATPC XCT is the name of the executable file which is stored on 
logical unit 1 by CHAIN. 

The assignments made to build the executable file described in this chapter result in all files (ASCII, binary, or tempo- 
rary) being stored on DECtape 1, and all hard copy output being on the Teletype unit, which is also used for the dia- 
logue of the modules. 

The user can increase the processing and output speed by optimally assigning peripheral handlers to the STATPAC log- 
ical units. 
3.2 ADDING PROCESSING MODULES TO STATPAC 

For example, assume the user wishes to add to STATPAC a module which is written in one chain. The name of the new 
module is ABCDE. The chain must include the following statements: 



      COMMON ICNTRL, CPBLTY, IFLAG 
      DATA ABCDE/5HABCDE/ 



C     PROCESSING BEGINS HERE 
100   CONTINUE 

C     PROCESSING IS FINISHED 
C     ASK THE USER WHICH MODULE IS WANTED NEXT 
200   WRITE (4,201) 
201   FORMAT (6H *PROG) 
      READ (5,202) CPBLTY 
202   FORMAT (A5) 
C     SAME MODULE REQUESTED AGAIN, RESTART PROCESSING 
      IF (CPBLTY.EQ.ABCDE) GO TO 100 
C     DIFFERENT MODULE REQUESTED, CALL CONTROL MODULE 
      CALL CHAIN (1) 
      END 

If the newly added module must read a data file which was written by INPUT, the file (on logical-1) could be read with 
the following coding: 



C     READ AAAAA, WHICH WAS GIVEN BY USER IN QUEST. 2 OF INPUT. 
      READ (5,100) FILE(1) 
100   FORMAT (A5) 
C     THE EXTENSION STP IS SUPPLIED BY STATPAC 
      FILE(2)=4H STP 
      CALL SEEK (1,FILE) 
C     HEADER RECORD, NUMBER OF VARIABLES AND THEIR NAMES 
      READ (1) L,(NAME(I),I=1,L) 
C     INITIALIZE OBSERVATION COUNTER 
      N=0 
C     NO IS THE NUMBER OF OBSERVATIONS IN THE NEXT BLOCK 
103   READ (1) NO 
      IF (NO.EQ.0) GO TO 102 
      N=N+NO 
      DO 101 NO1=1,NO 
      READ (1) (X(I),I=1,L) 
101   CONTINUE 
      GO TO 103 
C     ALL OBSERVATIONS READ 
C     N = TOTAL NUMBER OF OBSERVATIONS 
102   CALL CLOSE (1) 



The format of the files formed by INPUT which must be read by a user written module for analysis is given in 
Section 1.2. This format is summarized below. The file is stored on a bulk storage device with directory entry AAAAA 
STP where the name AAAAA is supplied by the user in response to Question 2 of INPUT and the STP is automatically 
supplied by STATPAC. In the following description, L is the number of variables (the highest acceptable subscript 
supplied in Question 5 of INPUT). 



[L, NAME(1),...,NAME(L)]    Contains the number of variables per observation and their 
                            respective names 

[NO]                        NO is the number of observations which follow 
[X1,X2,...,XL]              Contains one observation 
      .
      .
[X1,X2,...,XL]              Contains one observation 

[NO]                        NO is the number of observations which follow 
[X1,X2,...,XL]              Contains one observation 
      .
      .
[X1,X2,...,XL]              Contains one observation 

[0]                         Contains 0 to indicate that zero observations follow, i.e., the 
                            end of the data file 
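
The record structure above can be walked with a simple loop: read the header, then read blocks until a block count of 0 is met. The sketch below is Python for illustration only. It models the file as a hypothetical in-memory list of records (tuples) rather than IOPS binary on a bulk storage device, and the function name is an assumption.

```python
def read_standardized(records):
    """Return (names, observations) from a list of records in INPUT's layout."""
    stream = iter(records)
    header = next(stream)                   # [L, NAME(1), ..., NAME(L)]
    n_vars, names = header[0], list(header[1:])
    observations = []
    while True:
        (block_count,) = next(stream)       # [NO]
        if block_count == 0:                # [0] marks the end of the data file
            break
        for _ in range(block_count):
            observation = next(stream)      # [X1, X2, ..., XL]
            if len(observation) != n_vars:
                raise ValueError("observation does not hold L values")
            observations.append(list(observation))
    return names, observations
```

The total number of observations is then `len(observations)`, which corresponds to the counter N accumulated in the FORTRAN coding shown earlier in this section.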



The entries in COMMON which must be made in the user supplied module are the following: 

ICNTRL      Used by the CONTROL module 

CPBLTY      Used if the user requests a different module once processing is completed 
            by the module presently in core (used to read the answer to "*PROG") 

IFLAG       May be used in the user supplied program to define multiple entry points 
            into a chain if the module occupies more than one chain (set to 1 by 
            CONTROL) 

When the user adds a module to STATPAC, the CONTROL module must be modified to allow the new module to be 
called. The following changes must be made to CONTROL, assuming that the new module is named ABCDE and that 
chain number 20 is assigned to it when the executable file is built. (The CONTROL module is chain 1.) 

a. Increase the dimension of TABLE by 1 (i.e., for the first addition, TABLE (6) is the correct dimension). 

b. Add the following DATA statement to the CONTROL module: 

      DATA ABCDE/5HABCDE/ 

c. Increase the value of MODULS by 1 (i.e., for the first addition, MODULS/5/ is changed to MODULS/6/). 

d. The module selector statement "GOTO (101,102,103,104,105),I" should be changed to add the statement 
number of a CALL CHAIN command. For the first addition, the GOTO statement is changed to: 

      GOTO (101,102,103,104,105,106),I 

and the following statement is added after statement 105: 

106   CALL CHAIN (20) 
CHAPTER 4 
SAMPLE OPERATION 



The user dialogue with STATPAC and the possible data output are illustrated in this chapter. Each of the options and 
plots which may be requested in the modules is included in the output. The responses to the initial dialogue are for 
illustrative purposes only, and are not intended as examples of statistically meaningful responses. 

The user is referred to the following books for more complete descriptions of the statistical applications of the various 
options and plots. The selection of tolerances and test means, variances, etc., is discussed in these references. 

Descriptive Statistics:     Statistics in Research 
                            by Bernard Ostle 
                            Chapter 7 

                            Quality Control and Industrial Statistics 
                            by Acheson J. Duncan, Ph.D. 
                            Chapter 4 

Regression Analysis:        Applied Regression Analysis 
                            by N. R. Draper and H. Smith 
                            Chapters 1, 2, 3, 4, and 6 

                            BMD Biomedical Computer Programs 
                            edited by W. J. Dixon 
                            Pages 233-257 

                            Mathematical Methods for Digital Computers 
                            edited by Anthony Ralston, Ph.D. 
                            and Herbert S. Wilf, Ph.D. 
                            Chapter 17 

The operation of the CONTROL module is not illustrated explicitly since the only question used specifies the analysis 
module desired by the user. 

The data used to illustrate the STATPAC modules was obtained from BMD Biomedical Computer Programs, published 
by the University of California Press, used with permission of the editor, Mr. W. J. Dixon. 
4.1 INPUT EXAMPLE 

The following Teletype listing is an example of the dialogue for the INPUT module. Any printed line which is started 
with an asterisk (*) is typed by STATPAC; all other lines are typed by the user. Comments are added to the listing to 
aid the reader. 

The message 

***WARNING - COMMON SIZE DIFFERS*** 

is often typed during the execution of STATPAC. This message is typed by CHAIN and should be ignored by the 
STATPAC user. 

Following the INPUT dialogue, a complete listing of the BMD data is provided to illustrate the format used. This data 
(its source was credited at the beginning of this chapter) is analyzed by the STATPAC modules and is used throughout 
this chapter to illustrate the operation of the various modules. 



$E STATPC 

*PROG 
INPUT 
*FILE (OLD) 
BMD   SRC                       3 spaces 
*FILE (NEW) 
BMD 
*FORMAT 
(F7.2,7\F7.0,2F7.2,2F7.0)       note use of RUBOUT key 
*NO. 
00068 
*VARS 
VAR01=ONEBM 
VAR02=TWOBM 
VAR03=TREBM 
VAR04=QRTBM 
VAR06=SIXBM 
VAR05=FIVBM 
                                blank record (i.e., simple carriage return) 
*ZERO OBS. 

*O.K. 
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PIP)3SP1 
fifl ! 7S 

(^03P)3 

PI "161'^ 
P)P)5t5C1 

fi5r^p)P5 

1 4fl'^ 

'^0350 
«035« 
C)P!^50 

0070P) 
00400 
1 500 
00100 
00 3 50 
1 30 
00200 
1200 
00400 
00300 
00R00 
00 900 
00600 
00R00 

00 1 50 
00700 
00P00 
00200 
00600 

01 500 
0] 700 
01 600 
00300 
00 600 
01400 
00600 
001K0 
1 500 

1 800 
00500 
03000 
02900 
00! R0 

01 300 

1 900 

01 100 
01000 
00600 

00 500 
00100 

01 700 
00500 
00130 



00025 
00021 
00022 

00023 
000 10 
00007 
00006 
0000R 
00 1 R 
00003 
0000R 
00006 
00008 
00022 

1 3 
00026 
00023 
00003 

0001 5 
00028 
00006 
00035 

000 1 1 

0001 ! 
00032 
00008 
00023 
00038 
0001 5 
00 6 
00025 
00005 
00009 
00007 
00020 
00006 
000 1 2 
00026 
000 1 5 
00010 
0002« 
00034 
00004 
00032 
000 1 1 
00002 
000 1 8 
00003 
0000R 
000 14 
000 12 
00003 
0000 6 

12 

0001 1 
00008 
00024 
00026 
00029 

000 1 7 

0001 5 
00010 
00022 
0001 5 
00009 
00030 
000 10 



02500 
"12 1 00 
02200 
PI0 1 30 
02300 
00060 
001 40 
00080 
00270 
00360 
00 I 00 
00270 
00300 
10 
02200 

00 120 
02300 
00 100 
00250 

1 400 
00060 
03500 

01 100 

03200 
00 1 00 
02300 
3 800 
00500 

00 1 20 
02500 

001 70 
00075 
00350 
02000 
00086 
00400 

00 1 60 
00300 
00090 
02800 
03400 
00080 
03200 

01 100 
00050 

00 1 60 
00040 

001 10 
00090 
00240 

00 1 50 
00550 
00200 

01 I 00 
00800 
02400 

02 600 
02900 
1 700 
00 500 
00500 
02200 
00 500 
00300 
03500 
00 130 



00 1 50 
•^ "1 5] 8 7 
00043 
00 1 80 
00200 
00 3 30 
00340 
00500 
00 1 50 
00 1 80 

00 1 40 

001 00 
001 50 
00250 
00 11 
00280 

000 73 
00010 
00350 
00028 
00001 
00500 
00570 
00340 
00050 
00660 
00450 

0001 5 
00220 
001 50 
00370 
00100 
00030 
001 90 
00260 
00220 
00250 
00120 
00110 
001 60 

010 00 
00420 
00090 
00360 
00180 
00230 

00 180 

001 10 
00130 
00200 
00070 
00 1 50 
00080 
00570 
0041 
00200 

00 1 00 
00110 

001 70 
04800 
00 1 60 
00350 
00 1 00 
00120 
00080 
1 300 
00090 
00900 



3 4 
00036 
00 4 1 

000 1 5 
00033 

0001 3 

000 1 6 

0001 1 
000 1 9 

P 7 

0001 4 
00025 

0002 1 
000 1 8 
00046 
000 1 7 
0004R 
00036 
0000*5 
00033 
00046 

000 1 
00038 

0001 6 
00020 
00038 
00012 
00049 
00043 
00033 
00009 
00035 
00021 
0001 7 
0001 2 
00030 
0001 5 
00020 
00035 
00029 
00012 
00040 
00042 
0001 1 
00044 
0001 4 
0001 1 
00032 
0001 5 
0001 7 
00029 
00021 
00013 
00009 
0001 6 
00022 
00022 

00038 

00038 

00 02 9 
00025 

0001 9 
00026 
00039 
00029 
00010 
00058 
000 10 



00064 
00065 
00082 

23 
00064 

0001 6 
000 1 2 
00027 
0004R 
00 50 

000 12 
00013 
00020 
00023 

001 1 8 

Q\f^,Q\^ 5(7! 

00063 
001.50 
00072 
00054 
00109 
00010 
00125 
00044 
00048 
00105 
00009 
00130 

00 1 60 
00048 
00036 

001 50 
0007R 
00023 
00042 
00072 
00020 
00036 
00056 
00036 
00026 

00 1 08 
00106 

0001 6 
00104 
00047 

0002 7 
00012 

00007 
00018 
00028 
00025 
0001 1 
00020 
0001 4 
00038 
00 103 
00106 
00063 
00208 
00032 

00028 
00032 
00 1 00 
00050 

000R0 
00065 
00025 
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4.2 SMMRY EXAMPLE 

The following statistical output is the result of analysis by SMMRY on the previously presented BMD data and two 
arbitrary files, DATAl and DATA2. The arbitrary data files are included to allow the user to select all six options of 
SMMRY for demonstration purposes. Thus the complete printed output is presented with options. 



*PROG
SMMRY
***WARNING - COMMON SIZE DIFFERS***     This message should be ignored.
***WARNING - COMMON SIZE DIFFERS***

*FILE     User requests analysis of three data files - BMD, DATA1, and DATA2.
BMD
DATA1
DATA2

*VARS     User types the variables to be analyzed - no specific order necessary.
01
04
05
02
03

*OPTS     User requests all six options - any nonzero digit may be used to request an option.
221355

*MEAN     User types the values for the test mean of each variable - option 1 was requested.
MEN01=1.6
MEN02=1.7
MEN03=1.8
MEN04=2.3
MEN05=2.5

*VRNC     User types the values of the test variances of each variable - option 2 was requested.
VAR01=4.1
VAR02=4.2
VAR03=4.3
VAR04=4.4
VAR05=3.5

*O.K.     Processing will now begin.

Carriage returns are typed to terminate lists.
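
The OPTS reply can be read as a row of switches, one digit per option, in option order. The following Python sketch shows that reading. The function name `requested_options` and the treatment of a short reply (missing digits count as zero) are assumptions made for illustration; they are not taken from the STATPAC source.

```python
def requested_options(reply, n_options=6):
    """Return the 1-based numbers of the options switched on by an OPTS reply."""
    # Assumption: a reply shorter than n_options leaves the remaining options off.
    digits = reply.strip().ljust(n_options, "0")[:n_options]
    # Any nonzero digit requests the option in that position.
    return [position + 1 for position, digit in enumerate(digits) if digit != "0"]


print(requested_options("221355"))   # the reply typed in this example
print(requested_options("100100"))   # a reply requesting options 1 and 4 only
```

The same positional reading applies to the four-digit OPTS and PLTS replies of the regression modules.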



DESCRIPTIVE STATISTICS

BMD                    Name of file being analyzed
NO. OBS. = 68

VARIABLE                                  STANDARD      STANDARD
NO. NAME    MEAN          VARIANCE      DEVIATION     ERROR
1 ONEBM     0.69956E+01   0.41909E+02   0.64738E+01   0.78506E+00
2 TWOBM     0.15250E+02   0.87563E+02   0.93575E+01   0.11348E+01
3 TREBM     0.10425E+02   0.13519E+03   0.11627E+02   0.14100E+01
4 QRTBM     0.30996E+01   0.35966E+02   0.59971E+01   0.72726E+00
5 FIVBM     0.25397E+02   0.15574E+03   0.12479E+02   0.15134E+01

VARIABLE
NO.   SKEWNESS      KURTOSIS      MAX           MIN           RANGE
1     (illegible)   0.21928E+01   0.30000E+02   0.25000E+00   0.29750E+02
2     (illegible)   (illegible)   (illegible)   (illegible)   (illegible)
3     0.89081E+00   0.72163E+00   0.38000E+02   0.40000E+00   0.37600E+02
4     0.62546E+01   0.43371E+02   0.48000E+02   0.10000E-01   0.47990E+02
5     0.43154E+00  -0.85983E+00   0.58000E+02   0.50000E+01   0.53000E+02
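
Several columns of the descriptive table are tied together by simple relations, which gives a quick way to check a printout. The Python sketch below recomputes the standard deviation, standard error, and range of variable 1 (ONEBM) from the printed variance, maximum, and minimum. The variable names are illustrative; small differences in the last digit are expected because the printed variance is already rounded.

```python
import math

n_obs = 68                               # NO. OBS. for file BMD
variance = 0.41909e02                    # VARIANCE of variable 1 (ONEBM)
maximum, minimum = 0.30000e02, 0.25000e00

std_dev = math.sqrt(variance)            # STANDARD DEVIATION column
std_err = std_dev / math.sqrt(n_obs)     # STANDARD ERROR column
value_range = maximum - minimum          # RANGE column

print(f"std dev {std_dev:.4f}  std err {std_err:.5f}  range {value_range:.2f}")
```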



CORRELATION MATRIX

            ONEBM 1     TWOBM 2     TREBM 3     QRTBM 4     FIVBM 5
1 ONEBM     1.000000
2 TWOBM    -0.176465    1.000000
3 TREBM     0.005134    0.867992    1.000000
4 QRTBM     0.255482    0.100719    0.125866    1.000000
5 FIVBM    -0.195689    0.879120    0.751937   -0.140437    1.000000

TREBM 3: the name and subscript of the third variable in the analysis.



DATA1
NO. OBS. = 25

VARIABLE                                  STANDARD      STANDARD
NO. NAME    MEAN          VARIANCE      DEVIATION     ERROR
1 ONE01     0.45600E+02   0.12417E+02   0.35237E+01   0.70475E+00
2 TWO01     0.31300E+03   0.14840E+05   0.12182E+03   0.24364E+02
3 TRE01     0.24136E+03   0.15286E+04   0.39097E+02   0.78194E+01
4 QRT01     0.10868E+03   0.32631E+03   0.18064E+02   0.36128E+01
5 FIV01     0.37440E+02   0.23059E+03   0.15185E+02   0.30370E+01

VARIABLE
NO.   SKEWNESS      KURTOSIS      MAX           MIN           RANGE
1     0.46517E+00   0.12996E+01   0.51000E+02   0.35000E+02   0.16000E+02
2     0.68285E+00   0.10441E+01   0.67700E+03   0.13900E+03   0.53800E+03
3     0.14744E+01   0.18508E+01   0.35700E+03   0.19800E+03   0.15900E+03
4     0.46925E+00  -0.84454E+00   0.14700E+03   0.79000E+02   0.68000E+02
5     0.33407E+00  -0.13492E+01   0.64000E+02   0.15000E+02   0.49000E+02

CORRELATION MATRIX

            ONE01 1     TWO01 2     TRE01 3     QRT01 4     FIV01 5
1 ONE01     1.000000
2 TWO01     0.011551    1.000000
3 TRE01    -0.291373   -0.179528    1.000000
4 QRT01     0.352039   -0.029065   -0.460303    1.000000
5 FIV01     0.310231   -0.164700   -0.333221    0.938052    1.000000
DATA2
NO. OBS. = 10

VARIABLE                                  STANDARD      STANDARD
NO. NAME    MEAN          VARIANCE      DEVIATION     ERROR
1 ONE02     0.78000E+01   0.39956E+02   0.63210E+01   0.19989E+01
2 TWO02     0.51500E+02   0.20828E+03   0.14432E+02   0.45637E+01
3 TRE02     0.12400E+02   0.47822E+02   0.69154E+01   0.21868E+01
4 QRT02     0.25800E+02   0.18907E+03   0.13750E+02   0.43482E+01
5 FIV02     0.98340E+02   0.20214E+03   0.14218E+02   0.44961E+01

VARIABLE
NO.   SKEWNESS      KURTOSIS      MAX           MIN           RANGE
1     (illegible)  -0.70511E+00   0.21000E+02   0.10000E+01   0.20000E+02
2    -0.12735E+00  -0.15300E+01   0.71000E+02   0.31000E+02   0.40000E+02
3     0.37643E+00  -0.16785E+01   0.23000E+02   0.40000E+01   0.19000E+02
4     0.12898E+00  -0.14738E+01   0.47000E+02   0.60000E+01   0.41000E+02
5    -0.35647E+00  -0.13386E+01   0.11590E+03   0.72500E+02   0.43400E+02

CORRELATION MATRIX

            ONE02 1     TWO02 2     TRE02 3     QRT02 4     FIV02 5
1 ONE02     1.000000
2 TWO02     0.095004    1.000000
3 TRE02    -0.872371   -0.249384    1.000000
4 QRT02    -0.107896   -0.972027    0.150505    1.000000
5 FIV02     0.729910    0.728978   -0.714846   -0.737734    1.000000



OPTION 1

VAR   USER MEAN     BMD 67        DATA1 24      DATA2 9
1     0.1600E+01    0.6873E+01    0.6243E+02    0.3102E+01
2     0.1700E+01    0.1194E+02    0.1278E+02    0.1091E+02
3     0.1800E+01    0.6117E+01    0.3064E+02    0.4847E+01
4     0.2300E+01    0.1099E+01    0.2945E+02    0.5405E+01
5     0.2500E+01    0.1513E+02    0.1150E+02    0.2132E+02

The number after each file name is N - 1 (degrees of freedom for each data file).
USER MEAN: answers to Question 4.
The remaining columns are t-values to be compared with
values obtained from t-table.
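
The printed t-values agree with the usual one-sample statistic, the difference between the file mean and the user's test mean divided by the standard error of the mean. The Python sketch below reproduces the BMD entry for variable 1 from the descriptive table. The formula is inferred from the printed values rather than quoted from the package, and the function name is illustrative.

```python
def one_sample_t(sample_mean, std_error, test_mean):
    """t-statistic for testing a file mean against a user-supplied mean."""
    return (sample_mean - test_mean) / std_error


# Variable 1 (ONEBM) of file BMD: MEAN, STANDARD ERROR, and the user's test mean.
t_bmd_var1 = one_sample_t(0.69956e01, 0.78506e00, 0.1600e01)
print(f"{t_bmd_var1:.3f}")   # compare with 0.6873E+01 in the OPTION 1 table
```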



OPTION 2

VAR   USER VARNC    BMD 67        DATA1 24      DATA2 9
1     0.4100E+01    0.6849E+03    0.7268E+02    0.8771E+02
2     0.4200E+01    0.1397E+04    0.8480E+05    0.4463E+03
3     0.4300E+01    0.2106E+04    0.8532E+04    0.1001E+03
4     0.4400E+01    0.5477E+03    0.1780E+04    0.3867E+03
5     0.3500E+01    0.2981E+04    0.1581E+04    0.5198E+03

USER VARNC: answers to Question 5.
The remaining columns are chi-square values to be compared with values obtained
from chi-square table.
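
The chi-square entries agree with the usual statistic for testing a variance, (N - 1) times the file variance divided by the user's test variance. The Python sketch below reproduces the BMD entry for variable 1. As with the t-values, the formula is inferred from the printed numbers and the function name is illustrative.

```python
def variance_chi_square(sample_variance, n_obs, test_variance):
    """Chi-square statistic for testing a file variance against a user value."""
    return (n_obs - 1) * sample_variance / test_variance


# Variable 1 (ONEBM) of file BMD: VARIANCE, NO. OBS., and the user's test variance.
chi_sq = variance_chi_square(0.41909e02, 68, 0.4100e01)
print(f"{chi_sq:.1f}")   # compare with 0.6849E+03 in the OPTION 2 table
```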
OPTION 3

VAR = 1

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.0000E+00   -0.2825E+02   -0.3679E+00
DATA1 24     0.2825E+02    0.0000E+00    0.2263E+02
DATA2  9     0.3679E+00   -0.2263E+02    0.0000E+00

The entry -0.2825E+02 is the t-statistic comparing variable 1 in BMD and DATA1.

VAR = 2

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.0000E+00   -0.2018E+02   -0.1061E+02
DATA1 24     0.2018E+02    0.0000E+00    0.6710E+01
DATA2  9     0.1061E+02   -0.6710E+01    0.0000E+00

VAR = 3

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.0000E+00   -0.4404E+02   -0.5219E+00
DATA1 24     0.4404E+02    0.0000E+00    0.1825E+02
DATA2  9     0.5219E+00   -0.1825E+02    0.0000E+00

VAR = 4

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.0000E+00   -0.4255E+02   -0.9113E+01
DATA1 24     0.4255E+02    0.0000E+00    0.1303E+02
DATA2  9     0.9113E+01   -0.1303E+02    0.0000E+00

VAR = 5

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.0000E+00   -0.3887E+01   -0.1696E+02
DATA1 24     0.3887E+01    0.0000E+00   -0.1090E+02
DATA2  9     0.1696E+02    0.1090E+02    0.0000E+00
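
The entries of these matrices agree with a two-sample t-statistic computed with a pooled variance. The Python sketch below reproduces the BMD versus DATA1 entry for variable 3 from the means, variances, and observation counts in the descriptive tables. The pooled form is inferred from the printed values, not quoted from the package, and the function name is illustrative.

```python
import math


def pooled_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Two-sample t-statistic using a pooled variance estimate."""
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err


# Variable 3: file BMD (68 observations) against file DATA1 (25 observations).
t_bmd_data1 = pooled_t(0.10425e02, 0.13519e03, 68, 0.24136e03, 0.15286e04, 25)
print(f"{t_bmd_data1:.2f}")   # compare with -0.4404E+02 in the VAR = 3 matrix
```

Swapping the two files changes only the sign, which is why each matrix is antisymmetric with zeros on the diagonal.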
OPTION 4

VAR = 1

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.1000E+01    0.3375E+01    0.1049E+01
DATA1 24     0.2963E+00    0.1000E+01    0.3108E+00
DATA2  9     0.9534E+00    0.3218E+01    0.1000E+01

VAR = 2

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.1000E+01    0.5901E-02    0.4204E+00
DATA1 24     0.1695E+03    0.1000E+01    0.7125E+02
DATA2  9     0.2379E+01    0.1404E-01    0.1000E+01

The entry 0.1404E-01 is the F-statistic comparing variable 2 in DATA2 and DATA1.

VAR = 3

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.1000E+01    0.8844E-01    0.2827E+01
DATA1 24     0.1131E+02    0.1000E+01    0.3196E+02
DATA2  9     0.3537E+00    0.3129E-01    0.1000E+01

VAR = 4

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.1000E+01    0.1102E+00    0.1902E+00
DATA1 24     0.9073E+01    0.1000E+01    0.1726E+01
DATA2  9     0.5257E+01    0.5794E+00    0.1000E+01

VAR = 5

             BMD 67        DATA1 24      DATA2 9
BMD   67     0.1000E+01    0.6754E+00    0.7704E+00
DATA1 24     0.1481E+01    0.1000E+01    0.1141E+01
DATA2  9     0.1298E+01    0.8766E+00    0.1000E+01
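
Each entry of these matrices equals the variance of the row file divided by the variance of the column file, so entries mirrored across the diagonal are reciprocals. The Python sketch below reproduces the two BMD and DATA1 entries for variable 1 from the descriptive tables. The function name is illustrative, and the ratio form is inferred from the printed values.

```python
def variance_ratio(var_row, var_col):
    """F-statistic formed as the ratio of two file variances."""
    return var_row / var_col


var_bmd, var_data1 = 0.41909e02, 0.12417e02   # VARIANCE of variable 1 in each file

f_bmd_over_data1 = variance_ratio(var_bmd, var_data1)
f_data1_over_bmd = variance_ratio(var_data1, var_bmd)
print(f"{f_bmd_over_data1:.3f} {f_data1_over_bmd:.4f}")
# compare with 0.3375E+01 and 0.2963E+00 in the VAR = 1 matrix
```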
OPTION 5

VAR   F-VALUE
1     0.4252E+03
2     0.2429E+03
3     0.1142E+04
4     0.9584E+03
5     0.2280E+03

V1 = 2
V2 = 100

The entry 0.1142E+04 is the F-statistic comparing variable 3 in all files of the analysis.
V1 and V2 are the degrees of freedom parameters.



OPTION 6

VAR   UNCORRECTED   CORRECTED
1     0.1062E+02    0.9609E+01
2     0.2417E+03    0.2187E+03
3     0.7398E+02    0.6694E+02
4     0.5215E+02    0.4719E+02
5     0.1527E+01    0.1381E+01

K-1 = 2
C = 0.1105E+01     Correction factor



4.3 STPRG EXAMPLE

The following example illustrates the STPRG module of STATPAC, including all options and plots. The previously
presented BMD data are analyzed with variable 6 as the dependent variable. STPRG performs four steps and stops after
including independent variables 1, 3, 4, and 5 in the regression model. Variable 2 was not entered into the model:
at steps 1 and 2 another variable had a larger F (ENTER), and from step 3 on its F (ENTER) value remained below
F-IN (= 0.5).
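
The entry decision can be sketched in Python as follows. In the sample run the variable that enters at each step is the candidate with the largest F (ENTER), and the run stops when the largest remaining value no longer exceeds F-IN. The function name, the dictionary layout, and the strict "greater than" comparison are assumptions made for illustration; the F values are those printed under VAR. NOT IN REG. in this example.

```python
def next_variable_to_enter(f_enter, f_in):
    """Return the candidate with the largest F (ENTER) if it exceeds F-IN, else None."""
    if not f_enter:
        return None
    name = max(f_enter, key=f_enter.get)
    # Assumption: entry requires F (ENTER) strictly greater than F-IN.
    return name if f_enter[name] > f_in else None


# F (ENTER) values printed after steps 3 and 4 of this example.
after_step_3 = {"ONEBM": 0.787547, "TWOBM": 0.0156066}
after_step_4 = {"TWOBM": 0.170755}

print(next_variable_to_enter(after_step_3, f_in=0.5))   # ONEBM enters at step 4
print(next_variable_to_enter(after_step_4, f_in=0.5))   # nothing left to enter
```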

*PROG
STPRG
***WARNING - COMMON SIZE DIFFERS***     This message should be ignored.

*FILE     User types the name of the file to be analyzed.
BMD

*VARS     User types the variables to be considered for the model. The last
1         variable listed is used as the dependent variable.
4
5
02
03
06

*FIN      User types the F-value to determine entry.
.5
*FOUT     User types the F-value to determine exit.
.3
*LIM      User types the maximum number of iterations.
06
*TOL      User requests the default tolerance (0.001).

*OPTS     User requests all options.
1111
*COEF     User supplies the test coefficients for option 2.
COF01=1.1
COF02=1.2
COF03=1.3
COF05=1.4
COF04=1.5
COF02=1.2678
          User types simple carriage return.
*FCTR     User supplies the value for t(N-p, 1-ALPHA/2).
2.0
*PLTS     User requests all plots.
1111
*O.K.     Processing will now begin.



REGRESSION OUTPUT (STPRG)          Stepwise regression

DATA FILE   BMD          Name of file (Question 1).
NO. OBS.    68
RESP.       6 SIXBM      Subscript and name of dependent variable.
TOL.        0.00100      Response to Question 6.
F-IN        0.50000      Responses to Questions 3 and 4.
F-OUT       0.30000

VARIABLE    CORR. X.VS.Y
1 ONEBM     0.082054
2 TWOBM     0.749617
3 TREBM     0.786058
4 QRTBM     0.342568
5 FIVBM     0.645167

Variable subscripts: response to Question 2.
Variable names: responses to Question 5 of INPUT.



STEP NO. 1

VAR. ENTERING       3 TREBM       Subscript and name of first variable in model
SEQ. F-TEST         106.724       F-test to determine entry (106.724 > 0.5)
DEGREES OF FREEDOM  66
CHANGE IN R-SQ      0.617888      Change in squared multiple correlation coefficient
R-SQ                0.617888      New squared multiple correlation coefficient
STD. ERR. Y         27.1238       Standard error of dependent variable

ANOVA               Analysis of Variance

SOURCE   D.F.   SUM OF SQUARES   MEAN SQUARE      OVERALL F
TOTAL    67     0.127073E+06
REGRS.   1      0.785169E+05     0.785169E+05     0.106724E+03
RESID.   66     0.485562E+05     0.735700E+03

VAR. IN REG.

VARIABLE    COEFFICIENT     STD. ERROR      F (REMOVE)
3 TREBM     0.294426E+01    0.285000E+00    0.106724E+03

B0 = 26.0998        Constant term of the model

F (REMOVE): F-value needed to remove this variable from
the present regression model.

VAR. NOT IN REG.

VARIABLE    TOLERANCE    PARTIAL CORR.    F (ENTER)
1 ONEBM     0.999974     0.126216         0.105225E+01
2 TWOBM     0.246590     0.219325         0.328473E+01
4 QRTBM     0.984158     0.397285         0.121821E+02
5 FIVBM     0.434591     0.132760         0.116621E+01

PARTIAL CORR.: partial correlation with y.
F (ENTER): F-values needed to enter these variables in the present regression model.
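
The summary lines of each step follow from the ANOVA table of that step. The Python sketch below recomputes OVERALL F, R-SQ, and STD. ERR. Y for step 1 from the printed sums of squares and degrees of freedom. Variable names are illustrative; the last printed digit can differ slightly because the inputs are already rounded.

```python
import math

# Step 1 ANOVA table: sums of squares and degrees of freedom as printed.
ss_total, df_total = 0.127073e06, 67
ss_regr, df_regr = 0.785169e05, 1
ss_resid, df_resid = 0.485562e05, 66

ms_regr = ss_regr / df_regr             # MEAN SQUARE, regression
ms_resid = ss_resid / df_resid          # MEAN SQUARE, residual
overall_f = ms_regr / ms_resid          # OVERALL F
r_squared = ss_regr / ss_total          # R-SQ
std_err_y = math.sqrt(ms_resid)         # STD. ERR. Y

print(f"{overall_f:.3f} {r_squared:.6f} {std_err_y:.4f}")
```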



STEP NO. 2

VAR. ENTERING       4 QRTBM       Variable 4 enters regression model
SEQ. F-TEST         12.1821
DEGREES OF FREEDOM  65
CHANGE IN R-SQ      0.060311
R-SQ                0.678199
STD. ERR. Y         25.0821

ANOVA

SOURCE   D.F.   SUM OF SQUARES   MEAN SQUARE      OVERALL F
TOTAL    67     0.127073E+06
REGRS.   2      0.861808E+05     0.430904E+05     0.684940E+02
RESID.   65     0.408923E+05     0.629112E+03

VAR. IN REG.

VARIABLE    COEFFICIENT     STD. ERROR      F (REMOVE)
3 TREBM     0.282755E+01    0.265660E+00    0.113284E+03
4 QRTBM     0.179768E+01    0.515052E+00    0.121821E+02

B0 = 21.7445

VAR. NOT IN REG.

VARIABLE    TOLERANCE    PARTIAL CORR.    F (ENTER)
1 ONEBM     0.933987     0.027242         0.475306E-01
2 TWOBM     0.246516     0.246529         0.414141E+01
5 FIVBM     0.378439     0.321789         0.739256E+01
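
Two columns of these tables can be recomputed from their neighbours. F (REMOVE) agrees with the square of coefficient over standard error, and F (ENTER) agrees with the partial correlation p through p^2 (d - 1) / (1 - p^2), where d is the residual degrees of freedom of the current step. The Python sketch below checks both against the step 2 printout. The function names are illustrative, and the relations are inferred from the printed values.

```python
def f_to_remove(coefficient, std_error):
    """F (REMOVE) for a variable already in the regression."""
    return (coefficient / std_error) ** 2


def f_to_enter(partial_corr, resid_df):
    """F (ENTER) for a candidate variable, from its partial correlation."""
    p_squared = partial_corr ** 2
    # One residual degree of freedom is used up by the entering variable.
    return p_squared * (resid_df - 1) / (1.0 - p_squared)


f_remove_qrtbm = f_to_remove(0.179768e01, 0.515052e00)   # QRTBM at step 2
f_enter_fivbm = f_to_enter(0.321789, resid_df=65)        # FIVBM at step 2
print(f"{f_remove_qrtbm:.4f} {f_enter_fivbm:.5f}")
# compare with 0.121821E+02 and 0.739256E+01
```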



STEP NO. 3

VAR. ENTERING       5 FIVBM       Variable 5 enters the regression model
SEQ. F-TEST         7.39256
DEGREES OF FREEDOM  64
CHANGE IN R-SQ      0.033322
R-SQ                0.711521
STD. ERR. Y         23.9328

ANOVA

SOURCE   D.F.   SUM OF SQUARES   MEAN SQUARE      OVERALL F
TOTAL    67     0.127073E+06
REGRS.   3      0.904151E+05     0.301384E+05     0.526176E+02
RESID.   64     0.366580E+05     0.572781E+03

VAR. IN REG.

VARIABLE    COEFFICIENT     STD. ERROR      F (REMOVE)
3 TREBM     0.195840E+01    0.407974E+00    0.230429E+02
4 QRTBM     0.231239E+01    0.526652E+00    0.192786E+02
5 FIVBM     0.103553E+01    0.380860E+00    0.739256E+01

B0 = 2.91072

VAR. NOT IN REG.

VARIABLE    TOLERANCE    PARTIAL CORR.    F (ENTER)
1 ONEBM     0.883169     0.111114         0.787547E+00
2 TWOBM     0.113447     0.015737         0.156066E-01



STEP NO. 4          Last step - no more variables to be entered or removed

VAR. ENTERING       1 ONEBM       Variable 1 enters regression model
SEQ. F-TEST         0.787547
DEGREES OF FREEDOM  63
CHANGE IN R-SQ      0.003562
R-SQ                0.715082
STD. ERR. Y         23.9727

ANOVA

SOURCE   D.F.   SUM OF SQUARES   MEAN SQUARE      OVERALL F
TOTAL    67     0.127073E+06
REGRS.   4      0.908677E+05     0.227169E+05     0.395291E+02
RESID.   63     0.362054E+05     0.574689E+03

VAR. IN REG.

VARIABLE    COEFFICIENT     STD. ERROR      F (REMOVE)
1 ONEBM     0.427208E+00    0.481394E+00    0.787547E+00
3 TREBM     0.189677E+01    0.414512E+00    0.209389E+02
4 QRTBM     0.223335E+01    0.534995E+00    0.174266E+02
5 FIVBM     0.111674E+01    0.392316E+00    0.810276E+01

All F (REMOVE) values are greater than F-OUT (0.3).

B0 = -1.25284

VAR. NOT IN REG.

VARIABLE    TOLERANCE    PARTIAL CORR.    F (ENTER)
2 TWOBM     0.102908     0.052408         0.170755E+00

The F (ENTER) value is less than F-IN (0.5).
RESIDUAL OUTPUT

NO.   OBS             PRED            RESID
 1    0.640000E+02    0.885535E+02   -0.245535E+02
 2    0.650000E+02    0.862786E+02   -0.212786E+02
 3    0.820000E+02    0.887180E+02   -0.671796E+01
 4    0.230000E+02    0.227317E+02    0.268293E+00
 5    0.640000E+02    0.849735E+02   -0.209735E+02
 6    0.160000E+02    0.226273E+02   -0.662731E+01
 7    0.120000E+02    0.292135E+02   -0.172135E+02
 8    0.270000E+02    0.262787E+02    0.721302E+00
 9    0.480000E+02    0.289919E+02    0.190081E+02
10    0.500000E+02    0.418836E+02    0.811641E+01
11    0.120000E+02    0.215410E+02   -0.954102E+01
12    0.130000E+02    0.353019E+02   -0.223019E+02
13    0.200000E+02    0.320935E+02   -0.120935E+02
14    0.230000E+02    0.271830E+02   -0.418304E+01
15    0.118000E+03    0.947300E+02    0.232700E+02
16    0.500000E+02    0.503519E+02   -0.351914E+00
17    0.630000E+02    0.564708E+02    0.652920E+01
18    0.150000E+03    0.829056E+02    0.670944E+02
19    0.720000E+02    0.200252E+02    0.519748E+02
20    0.540000E+02    0.420349E+02    0.119651E+02
21    0.109000E+03    0.781895E+02    0.308105E+02
22    0.100000E+02    0.237146E+02   -0.137146E+02
23    0.125000E+03    0.121368E+03    0.363178E+01
24    0.440000E+02    0.282155E+02    0.157845E+02
25    0.480000E+02    0.439175E+02    0.408250E+01
26    0.105000E+03    0.119610E+03   -0.146104E+02
27    0.900000E+01    0.258037E+02   -0.168037E+02
28    0.130000E+03    0.103836E+03    0.261638E+02
29    0.160000E+03    0.124185E+03    0.358153E+02
30    0.480000E+02    0.499287E+02   -0.192869E+01
31    0.360000E+02    0.248910E+02    0.111090E+02
32    0.150000E+03    0.883400E+02    0.616600E+02
33    0.780000E+02    0.312197E+02    0.467803E+02
34    0.230000E+02    0.251065E+02   -0.210652E+01
35    0.420000E+02    0.258751E+02    0.161249E+02
36    0.720000E+02    0.785157E+02   -0.651573E+01
37    0.200000E+02    0.265577E+02   -0.655773E+01
38    0.360000E+02    0.339123E+02    0.208769E+01
39    0.560000E+02    0.467423E+02    0.925773E+01
40    0.360000E+02    0.410371E+02   -0.503711E+01
41    0.260000E+02    0.391791E+02   -0.131791E+02
42    0.108000E+03    0.109324E+03   -0.132396E+01
43    0.106000E+03    0.113005E+03   -0.700476E+01
44    0.160000E+02    0.231520E+02   -0.715201E+01
45    0.104000E+03    0.119008E+03   -0.150084E+02
46    0.470000E+02    0.476452E+02   -0.645188E+00
47    0.270000E+02    0.228350E+02    0.416496E+01
48    0.120000E+02    0.412560E+02   -0.292560E+02
49    0.700000E+01    0.217236E+02   -0.147236E+02
50    0.180000E+02    0.302658E+02   -0.122658E+02
51    0.280000E+02    0.369663E+02   -0.896633E+01
52    0.250000E+02    0.308700E+02   -0.586995E+01
53    0.110000E+02    0.243047E+02   -0.133047E+02
54    0.200000E+02    0.396499E+02   -0.196499E+02
55    0.140000E+02    0.317013E+02   -0.177013E+02
56    0.380000E+02    0.614628E+02   -0.234628E+02
57    0.103000E+03    0.531120E+02    0.498880E+02
58    0.106000E+03    0.899314E+02    0.160686E+02
59    0.630000E+02    0.998496E+02   -0.368496E+02
60    0.208000E+03    0.201456E+03    0.654361E+01
61    0.320000E+02    0.671833E+02   -0.351833E+02
62    0.280000E+02    0.415379E+02   -0.135379E+02
63    0.320000E+02    0.420629E+02   -0.100629E+02
64    0.100000E+03    0.888450E+02    0.111550E+02
65    0.500000E+02    0.428304E+02    0.716963E+01
66    0.800000E+02    0.519009E+02    0.280991E+02
67    0.650000E+02    0.134051E+03   -0.690510E+02
68    0.250000E+02    0.330358E+02   -0.803584E+01
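
Each residual is the observed value of the dependent variable minus the value predicted by the final model. The Python sketch below recomputes the residuals of the first four observations from the OBS and PRED values listed in the residual output. The list names are illustrative.

```python
# OBS and PRED for observations 1 to 4, as printed in the residual output.
observed = [0.640000e02, 0.650000e02, 0.820000e02, 0.230000e02]
predicted = [0.885535e02, 0.862786e02, 0.887180e02, 0.227317e02]

residuals = [obs - pred for obs, pred in zip(observed, predicted)]

for number, resid in enumerate(residuals, start=1):
    print(f"{number:2d} {resid:12.4f}")
```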



OPTION 1

VARIABLE    T-VALUE
1 ONEBM     0.887438E+00
3 TREBM     0.457590E+01
4 QRTBM     0.417451E+01
5 FIVBM     0.284654E+01

Note that t-statistics are computed only for variables included in the model.

OPTION 2

VARIABLE    USER COEFF.     T-VALUE
1 ONEBM     0.110000E+01   -0.139759E+01
3 TREBM     0.130000E+01    0.143968E+01
4 QRTBM     0.150000E+01    0.137075E+01
5 FIVBM     0.140000E+01   -0.722017E+00

USER COEFF.: response to Question 8.

OPTION 3

T(N-P, 1-ALPHA/2) = 2.00000       Response to Question 9 (FCTR)

VARIABLE    LOWER BOUND     UPPER BOUND
1 ONEBM    -0.535581E+00    0.139000E+01
3 TREBM     0.106774E+01    0.272579E+01
4 QRTBM     0.116335E+01    0.330334E+01
5 FIVBM     0.332109E+00    0.190137E+01

N-P = 63            Degrees of freedom
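
The three option tables can be recomputed from the coefficients and standard errors of the final step. The Python sketch below does this for TREBM: the option 1 t-value is coefficient over standard error, the option 2 t-value subtracts the user's test coefficient first, and the option 3 bounds are the coefficient plus or minus the FCTR value times the standard error. The function names are illustrative, and the relations are inferred from the printed values.

```python
def t_value(coefficient, std_error, test_coefficient=0.0):
    """t-statistic for a coefficient against a test value (zero for option 1)."""
    return (coefficient - test_coefficient) / std_error


def coefficient_bounds(coefficient, std_error, t_factor):
    """Lower and upper bounds: coefficient -/+ t_factor * std_error."""
    half_width = t_factor * std_error
    return coefficient - half_width, coefficient + half_width


coef, se = 0.189677e01, 0.414512e00           # TREBM at step 4

t_option_1 = t_value(coef, se)                # compare with 0.457590E+01
t_option_2 = t_value(coef, se, 1.3)           # compare with 0.143968E+01
lower, upper = coefficient_bounds(coef, se, 2.0)
print(f"{t_option_1:.4f} {t_option_2:.4f} {lower:.5f} {upper:.5f}")
```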
PLOT 1     YPRED (X) VS RESID (Y)

[Scatter plot. Horizontal axis: predicted y, 0.200E+02 to 0.201E+03. Vertical axis: residual, -0.691E+02 to 0.671E+02. Each plotted digit gives the number of observations at that position; a 2 marks two observations with the same value of y predicted and equal residuals.]

PLOT 2     YPRED (X) VS YOBS (Y)

[Scatter plot. Horizontal axis: predicted y, 0.200E+02 to 0.201E+03. Vertical axis: observed y, 0.700E+01 to 0.208E+03. A 2 marks two observations with the same value of y predicted and y observed.]

PLOT 3     TIME VS RESID

[Plot of the residual (horizontal axis, -0.691E+02 to 0.691E+02) against observation number (vertical axis, 1 to 68). Observation ordinal is used as the measure of time. The plot should show uniform scatter with no discernible pattern.]

PLOT 4     RESID HISTOGRAM

[Histogram. Horizontal axis: value of the residual, -0.691E+02 to 0.691E+02. Vertical axis: number of observations with a given residual value.]



4.4 MLTRG EXAMPLE

The following sample output is the result of analysis of the DATA1 file with the MLTRG module of STATPAC. The
output does not include options and plots, as the MLTRG options and plots are the same as those of STPRG previously
presented.



*PROG
MLTRG
***WARNING - COMMON SIZE DIFFERS***     This message should be ignored.

*FILE
DATA1
*VARS     Last variable typed is the dependent variable (all variables in the file
06        will be entered in the regression model).

*TOL
.00103
*OPTS     Only options 1 and 4 are requested.
1001
*PLTS     Only plots 2, 3 and 4 are requested.
0111
*O.K.
REGRESSION OUTPUT (MLTRG)

DATA FILE   DATA1
NO. OBS.    25
RESP.       6 SIX01      Dependent variable
TOL.        0.00103

VARIABLE    CORR. X.VS.Y
1 ONE01
2 TWO01
3 TRE01
4 QRT01
5 FIV01

All independent variables are included
in the model.

R-SQ          0.494342
STD. ERR. Y   1.15539

ANOVA

SOURCE   D.F.   SUM OF SQUARES   MEAN SQUARE      OVERALL F
TOTAL    24     0.501600E+02
REGRS.   5      0.247962E+02     0.495924E+01     0.371496E+01
RESID.   19     0.253638E+02     0.133494E+01

VAR. IN REG.

VARIABLE    COEFFICIENT     STD. ERROR
1 ONE01     (illegible)     0.724055E-01
2 TWO01     0.722737E-02    0.212183E-02
3 TRE01     0.158192E-01    0.728874E-02
4 QRT01     0.148276E-01    0.444280E-01
5 FIV01     0.339402E-01    0.505993E-01

B0 = -6.71895



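APPENDIX A
DESCRIPTIVE STATISTICS ALGORITHMS

$\bar{x}_{ji}$: mean of the j-th variable in the i-th file.

$$\bar{x}_{ji} = \frac{1}{N_i} \sum_{m=1}^{N_i} x_{jim}$$

$\sigma^2_{ji}$: variance of the j-th variable in the i-th file.

$$\sigma^2_{ji} = \Big[ \sum_{m=1}^{N_i} (x_{jim} - \bar{x}_{ji})^2 \Big] / (N_i - 1)$$

$\sigma_{ji}$: standard deviation of the j-th variable in the i-th file.

$$\sigma_{ji} = \sqrt{\sigma^2_{ji}}$$

$S.E._{ji}$: standard error of the mean of the j-th variable in the i-th file.

$$S.E._{ji} = \sigma_{ji} / \sqrt{N_i}$$

$SKEWNESS_{ji}$: coefficient of skewness of the j-th variable in the i-th file.

$$SKEWNESS_{ji} = \frac{\sum_{m=1}^{N_i} (x_{jim} - \bar{x}_{ji})^3 / N_i}{\sigma^3_{ji}}$$

$KURTOSIS_{ji}$: coefficient of kurtosis of the j-th variable in the i-th file.

$$KURTOSIS_{ji} = \frac{\sum_{m=1}^{N_i} (x_{jim} - \bar{x}_{ji})^4 / N_i}{\sigma^4_{ji}} - 3$$

$c_{rs}$: simple correlation coefficient between the r-th and s-th variables in the i-th file.

$$c_{rs} = \frac{\sum_{m=1}^{N_i} (x_{rim} - \bar{x}_{ri})(x_{sim} - \bar{x}_{si})}{\sqrt{\sum_{m=1}^{N_i} (x_{rim} - \bar{x}_{ri})^2 \, \sum_{m=1}^{N_i} (x_{sim} - \bar{x}_{si})^2}}$$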
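
The descriptive statistics defined in this appendix can be sketched in Python as follows. This is an illustration only, not the STATPAC FORTRAN code: the function name, the dictionary keys, and the sample data are made up, and the skewness and kurtosis are assumed to divide the third and fourth central moments by powers of the (N - 1)-based standard deviation.

```python
import math


def describe(values):
    """Descriptive statistics of one variable in one file."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    std_dev = math.sqrt(variance)
    third_moment = sum((x - mean) ** 3 for x in values) / n
    fourth_moment = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "std_err": std_dev / math.sqrt(n),
        # Assumption: both coefficients are scaled by the N-1 standard deviation.
        "skewness": third_moment / std_dev ** 3,
        "kurtosis": fourth_moment / std_dev ** 4 - 3.0,
        "max": max(values),
        "min": min(values),
        "range": max(values) - min(values),
    }


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up data
stats = describe(sample)
for name, value in stats.items():
    print(f"{name:9s} {value:12.5f}")
```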
APPENDIX B
REGRESSION ANALYSIS ALGORITHMS

$corr(x_i, y)$: correlation of the i-th independent variable with the dependent variable.

$$corr(x_i, y) = \frac{\sum_{m=1}^{N} (x_{im} - \bar{x}_i)(y_m - \bar{y})}{\sqrt{\sum_{m=1}^{N} (x_{im} - \bar{x}_i)^2 \, \sum_{m=1}^{N} (y_m - \bar{y})^2}}$$

$r^2$: squared multiple correlation.

$$r^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = 1 - a_{nn}$$

$s_y$: standard error of y.

$$s_y = \sqrt{\sum_{m=1}^{N} (y_m - \bar{y})^2 \, a_{nn} / (N - 1 - p)}$$
Sequential F-test (Entering)

where $V_{max}$ = maximum $V_i$, $V_i = a_{in} a_{ni} / a_{ii}$

$a_{ij}$ = elements of the correlation matrix

n = total number of variables being analyzed
$\phi$ = degrees of freedom (N - 1 - p)

Sequential F-test (Leaving)

$$F = |V_{min}| \, \phi / a_{nn}$$

where $V_{min}$ = minimum $V_i$, $V_i = a_{in} a_{ni} / a_{ii}$,

and other symbols as above.

ANOVA: Analysis of Variance Table

D.F. (Degrees of Freedom)
    Total         N - 1
    Regression    p
    Residual      N - 1 - p

Sum of Squares
    Total         $SS_y = \sum_{m=1}^{N} (y_m - \bar{y})^2$
    Regression    $SS_{reg} = SS_y (1 - a_{nn})$
    Residual      $SS_{resid} = SS_y - SS_{reg}$

Mean Square
    Regression    $ms_{reg} = SS_{reg} / p$
    Residual      $ms_{resid} = SS_{resid} / (N - 1 - p)$

Overall F
    Regression    $ms_{reg} / ms_{resid}$



Table of Variables in Regression

$b_i$: coefficient of the i-th variable.

$$b_i = b_{in} \, \sigma_y / \sigma_i$$

where $b_{in}$ = i-th element of last (n-th) column of inverted correlation matrix, and

$$\sigma_y = \sqrt{\sum_{m=1}^{N} (y_m - \bar{y})^2}, \qquad \sigma_i = \sqrt{\sum_{m=1}^{N} (x_{im} - \bar{x}_i)^2}$$

$s_i$: standard error of $b_i$.

$$s_i = \frac{s_y}{\sigma_i} \sqrt{b_{ii}}$$

where $s_y$ is standard error of y;

$\sigma_i$ is defined as above;

$b_{ii}$ is diagonal element of the inverted correlation matrix.



$F_i$: F-test to remove the i-th variable.

$$F_i = b_i^2 / s_i^2$$

$b_0$: constant term of model.

$$b_0 = \bar{y} - \sum_{i=1}^{T} b_i \bar{x}_i$$

where T = number of variables in regression

Table of Variables not in Regression

$tol_i$: tolerance of i-th variable.

$tol_i = a_{ii}$ (the i-th diagonal element of the inverted correlation matrix)

Partial correlation of the i-th variable.

$$part.corr._i = a_{in} / \sqrt{a_{ii} a_{nn}}$$

$F_i$: F-test to enter the i-th variable.

$$F_i = \{ a_{in}^2 (\phi - 1) \} / (a_{ii} a_{nn} - a_{in}^2)$$

where $\phi$ = N - 1 - p
Residual Output

$\hat{y}_i$: predicted value of the dependent variable for the i-th observation.

$$\hat{y}_i = b_0 + \sum_{j=1}^{T} b_j x_{ji}$$

where T = number of variables in regression.

$e_i$: residual for the i-th observation.

$$e_i = y_i - \hat{y}_i$$
HOW TO OBTAIN SOFTWARE INFORMATION

Announcements for new and revised software, as well as programming notes, software
problems, and documentation corrections are published monthly by Software Information
Service in the following newsletters.

Digital Software News for the PDP-8 Family
Digital Software News for the PDP-9 Family
Digital Software News for the PDP-15 Family

These newsletters contain information applicable to software available from Digital's
Program Library (see title page for address). Software products and documents are
usually shipped only after the Program Library receives a specific request from a user.

Digital Equipment Computer Users Society (DECUS) maintains a user library and publishes
a catalog of programs as well as the DECUSCOPE magazine for its members and
non-members who request it.

Please complete the card below to receive information on DECUS membership or to place
your name on the newsletter mailing list.



Please send
publications. To do this effectively we need user feedback - your critical evaluation of this manual.

Please comment on this manual's completeness, accuracy, organization, usability, and readability.

DEC also strives to keep its customers informed of current DEC software and publications. Thus, the following periodically
distributed publications are available upon request. Please check the appropriate box(es) for a current issue of the

[ ] PDP-15 Software Manual Update, a quarterly collection of revisions to current software manuals.